Abstract

We review three vector encodings of Bayesian network structures. The
first one has recently been applied by Jaakkola et al. [3], the other two
use special integral vectors introduced earlier, called imsets [11, 13].
The central topic is the comparison of outer polyhedral approximations of
the corresponding polytopes. We show how to transform the inequalities
suggested by Jaakkola et al. to the framework of imsets. The result of our
comparison is the observation that the implicit polyhedral approximation
of the standard imset polytope suggested in [14] gives a closer
approximation than the (transformed) explicit polyhedral approximation from
[3]. Finally, we confirm a conjecture from [14] that the above-mentioned
implicit polyhedral approximation of the standard imset polytope is an
LP relaxation of the polytope.



1 Introduction 

Bayesian networks (BNs) are popular graphical statistical models widely used 
both in probabilistic reasoning [B] and statistics [S]. They are attributed to 
acyclic directed graphs whose nodes correspond to the variables in consideration. 
The motivation for this report is learning the BN structure [7] from data by 
maximizing a quality (= scoring) criterion. The criterion is a real function of a
BN structure (= of a graph) and of a database; its value says how well the BN
structure given by the graph explains the occurrence of the database.

However, different (acyclic directed) graphs can define the same statistical 
model, in which case the graphs are Markov equivalent. Thus, a usual require- 
ment on the criterion is that it should be score equivalent, which means, it 
ascribes the same value to equivalent graphs. Another traditional technical re- 
quirement is that the criterion should be decomposable - for details see [5]. 
Since the aim is learning the BN structure (= statistical model), some researchers
prefer to have a unique representative for every BN structure and to
understand the criterion as a function of such unique representatives. A tradi- 
tional unique graphical representative of the BN structure is the essential graph 
of the corresponding Markov equivalence class of acyclic directed graphs, which 
is a special graph allowing both directed and undirected edges - for details see 

m- 

The basic idea of an algebraic approach to learning, proposed in connection 
with conditional independence structures [11], is to represent every BN structure
by a certain integral vector (= a vector with integers as components), called the 
standard imset. This is also a unique BN representative. The advantage of this 
algebraic approach is that every score equivalent and decomposable criterion 
becomes an affine function of the standard imset. 

It has been shown in [T^] that the standard imsets are vertices of a certain 
polytope, called the standard imset polytope. This allows one to re-formulate the
learning task as a linear programming (LP) problem. However, to apply standard
LP methods one needs the polyhedral description of the polytope. In [14], a
conjecture about an implicit polyhedral characterization of the standard imset 
polytope has been presented. The weaker version of the conjecture was that the 
polyhedron given by those inequalities is an LP relaxation of the polytope. 

Suitable transformation of an LP problem often simplifies things. Therefore,
in [13], an alternative algebraic representative for the BN structure, called
the characteristic imset, has been introduced. It is obtained from the standard
imset by an invertible affine transformation; however, unlike the standard imset,
the characteristic imset is always a zero-one vector. This opens the way to
the application of advanced methods of integer programming (IP) in this area.
Nonetheless, the crucial question of polyhedral characterization of the
(transformed) polytope remains to be answered.

Jaakkola et al. [3] have also proposed to apply the methods of linear and 
integer programming to learning BN structures. They have used a straightfor- 
ward zero-one encoding of acyclic directed graphs and transformed the task of 
maximizing the quality criterion to an IP problem. The main difference is that 
their vector codes are not unique BN representatives. On the other hand, they 
provide an explicit polyhedral LP relaxation of their polytope, which allows one 
to use the methods of IP. 

In this report, we transform the inequalities suggested by Jaakkola et al. to
the framework of imsets. First, we show that the implicit polyhedral approximation
of the standard imset polytope suggested in [14] gives a closer approximation
than the (transformed) explicit polyhedral approximation from [3]. Second,
we show that the transformed inequalities give an explicit LP relaxation of the
standard/characteristic imset polytope. A consequence of this fact is the proof
of the weaker version of the conjecture from [14].



2 Notation and terminology 

Throughout the paper $N$ is a finite set of variables which has at least two elements:
$|N| \ge 2$. Its power set, denoted by $\mathcal{P}(N)$, is the class of its subsets $\{A;\ A \subseteq N\}$.
For any $\ell = 1, 2$, we use a special notation

\[ \mathcal{P}_\ell(N) = \{A \subseteq N;\ |A| \ge \ell\} \]

for the class of subsets of $N$ of cardinality at least $\ell$. The symbol $U \subset V$ will
mean $U \subseteq V$, $U \ne V$.

We deal with directed graphs (without loops) having $N$ as the set of nodes
and call them directed graphs over $N$. Such a graph $G$ is specified by a collection
of arrows $j \to i$, where $i, j \in N$, $i \ne j$; the set $\mathrm{pa}_G(i) = \{j \in N;\ j \to i\}$ is
(called) the set of parents of node $i \in N$. A directed cycle in $G$ is a sequence
of nodes $i_1, \ldots, i_n$, $n \ge 3$, such that $i_r \to i_{r+1}$ in $G$ for $r = 1, \ldots, n-1$ and
$i_n = i_1$. A directed graph is acyclic if it has no directed cycle. A well-known
equivalent definition is that there exists an ordering $i_1, \ldots, i_{|N|}$ of nodes of $G$
consistent with the direction of arrows in $G$, which means $i_r \to i_s$ in $G$ implies
$r < s$. Clearly, every acyclic directed graph $G$ has at least one initial node, that
is, a node $i$ with $\mathrm{pa}_G(i) = \emptyset$.

We also deal with real vectors, elements of $\mathbb{R}^M$, where $M$ is a non-empty
finite set. By lattice points in $\mathbb{R}^M$ we mean integral vectors, that is, vectors whose
components are integers (= elements of $\mathbb{Z}^M$). In this paper, $M$ has additional
structure; typically, it is $\mathcal{P}(N)$ or $\mathcal{P}_2(N)$, in which cases the lattice points are
called imsets. To write formulas for imsets we will use the following notation:
given $A \subseteq N$, the corresponding basic vector will be denoted by $\delta_A$:

\[ \delta_A(S) = \begin{cases} 1 & \text{if } S = A, \\ 0 & \text{if } S \subseteq N,\ S \ne A. \end{cases} \]

A special semi-elementary imset $\mathsf{u}_{\langle A,B|C\rangle}$ is associated with any (ordered) triplet
of pairwise disjoint sets $A, B, C \subseteq N$:

\[ \mathsf{u}_{\langle A,B|C\rangle} = \delta_C - \delta_{A\cup C} - \delta_{B\cup C} + \delta_{A\cup B\cup C}, \]

which, in the context of [11], encodes the corresponding conditional independence
statement $A \perp\!\!\!\perp B \mid C$. The imsets will be denoted using sans serif fonts,
e.g. $\mathsf{u}$ or $\mathsf{c}$; general vectors by bold lower-case letters, e.g. $\mathbf{b}$ or $\boldsymbol{\eta}$. They are
interpreted as column vectors.

Matrices will be denoted by bold capitals, e.g. $\mathbf{A}$ or $\mathbf{C}$. The symbol $\mathbf{A}^\top$
denotes the transpose of $\mathbf{A}$. An invertible matrix $\mathbf{A}$ is unimodular if it is integral
(= has integers as entries) and its determinant is $+1$ or $-1$ (see § 4.1 in [5]); an
equivalent definition is that both $\mathbf{A}$ and its inverse $\mathbf{A}^{-1}$ are integral, that is,
the mappings $\mathbf{b} \mapsto \mathbf{A}\mathbf{b}$ and $\mathbf{c} \mapsto \mathbf{A}^{-1}\mathbf{c}$ ascribe lattice points to lattice points.

By a full row rank matrix we mean an $m \times n$ matrix which has $m$ linearly
independent columns (= has rank $m$). The concept of unimodularity was extended
in § 19.1 of [5] to matrices of this kind. A full row rank $m \times n$ matrix $\mathbf{A}$
is unimodular if every $m \times m$ submatrix has determinant $+1$, $-1$ or $0$; equivalently,
if every invertible $m \times m$ submatrix $\mathbf{B}$ of it is unimodular. A matrix $\mathbf{A}$
is totally unimodular if every (square) submatrix of it has determinant $+1$, $0$ or
$-1$.

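We also deal with special classes of subsets of $N$. More specifically, we will
consider non-empty classes $\mathcal{A}$ of non-empty subsets of $N$ which are closed under
supersets. These are classes $\emptyset \ne \mathcal{A} \subseteq \mathcal{P}_1(N)$ satisfying

\[ S \in \mathcal{A},\ S \subseteq T \subseteq N \ \Longrightarrow\ T \in \mathcal{A}. \]

Every such class $\mathcal{A}$ is characterized by the class $\mathcal{A}_{\min}$ of its minimal sets with
respect to inclusion:

\[ \mathcal{A}_{\min} = \{ S \in \mathcal{A};\ \forall\, T \subset S \quad T \notin \mathcal{A} \}. \]

Of course, $\mathcal{I} = \mathcal{A}_{\min}$ is a non-empty subclass of $\mathcal{P}_1(N)$ consisting of incomparable
sets, which means that no set in $\mathcal{I}$ is a proper subset of another set in $\mathcal{I}$.

Conversely, given a non-empty class $\mathcal{I} \subseteq \mathcal{P}_1(N)$ of incomparable sets, the
corresponding class $\mathcal{A}$ closed under supersets satisfying $\mathcal{I} = \mathcal{A}_{\min}$ is as follows:

\[ \mathcal{A} = \{ S \subseteq N;\ \exists\, T \in \mathcal{I} \quad T \subseteq S \}. \]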

Finally, in the proofs, we sometimes use Dirac's delta-symbol to shorten the
notation. Specifically, the notation $\delta(\star\star)$, where $\star\star$ is a predicate (= statement),
means a zero-one function whose value is $+1$ if the statement $\star\star$ is valid and
whose value is $0$ if the statement $\star\star$ does not hold.



3 Three ways of encoding Bayes nets 

3.1 Straightforward zero-one encoding of a directed graph 

Jaakkola et al. [4] used a special method for vector encoding of (acyclic) directed
graphs over $N$. Their 0-1-vectors $\boldsymbol{\eta}$ have components indexed by pairs $(i|B)$,
where $i \in N$ and $B \subseteq N \setminus \{i\}$. Although their intention was to encode acyclic
directed graphs only, one can formally encode any directed graph in this way.
Specifically, given a directed graph $G$ over $N$, the vector $\boldsymbol{\eta}_G$ encoding $G$ is
defined as follows:

\[ \eta_G(i|B) = 1 \iff B = \mathrm{pa}_G(i), \qquad \eta_G(i|B) = 0 \ \text{ otherwise.} \]

Example 1 Consider $N = \{a, b, c\}$ and $G: a \rightleftarrows b \leftarrow c$. It is a directed graph,
but not an acyclic one. We have $\mathrm{pa}_G(a) = \{b\}$, $\mathrm{pa}_G(b) = \{a, c\}$, $\mathrm{pa}_G(c) = \emptyset$.
Thus, $\eta_G(a|\{b\}) = 1$, $\eta_G(b|\{a, c\}) = 1$, $\eta_G(c|\emptyset) = 1$, and $\eta_G(i|B) = 0$ otherwise.
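The encoding of Example 1 can be reproduced with a short Python sketch. It is only an illustration: the helper names `subsets` and `eta_code`, and the representation of a graph as a dictionary of parent sets, are assumptions of this sketch and are not taken from Jaakkola et al.

```python
from itertools import combinations

def subsets(items):
    """All subsets of `items`, returned as frozensets."""
    items = sorted(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def eta_code(nodes, parents):
    """Return {(i, B): 0 or 1}; block i has its single 1 at B = pa_G(i)."""
    return {(i, B): int(B == frozenset(parents[i]))
            for i in sorted(nodes)
            for B in subsets(set(nodes) - {i})}

# Graph of Example 1: a <-> b <- c (directed, but not acyclic).
nodes = {"a", "b", "c"}
parents = {"a": {"b"}, "b": {"a", "c"}, "c": set()}
eta = eta_code(nodes, parents)

# Components equal to 1, written as (node, parent set).
support = sorted((i, "".join(sorted(B))) for (i, B), v in eta.items() if v == 1)
print(len(eta), support)   # 12 [('a', 'b'), ('b', 'ac'), ('c', '')]
```

The vector has $|N| \cdot 2^{|N|-1} = 12$ components, exactly one of which equals 1 in each of the three blocks.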

The polytope studied by Jaakkola et al. [4] is defined as the convex hull of
the set of vectors $\boldsymbol{\eta}_G$, where $G$ runs over all acyclic directed graphs over $N$.



3.1.1 Jaakkola et al.'s polyhedral approximation 

The (outer) polyhedral approximation $J$ of the above polytope proposed in [1]
is given by the following constraints:

• "simple" non-negativity constraints:

\[ \eta(i|B) \ge 0 \quad \text{for every } i \in N,\ B \subseteq N \setminus \{i\} \tag{1} \]

($|N| \cdot 2^{|N|-1}$ inequality constraints),

• equality constraints:

\[ \sum_{B \subseteq N \setminus \{j\}} \eta(j|B) = 1 \quad \text{for all } j \in N \tag{2} \]

($|N|$ equality constraints),

• cluster inequalities, which correspond to sets $C \subseteq N$, $|C| \ge 2$:

\[ 1 \le \sum_{i \in C}\ \sum_{B \subseteq N \setminus \{i\},\ B \cap C = \emptyset} \eta(i|B) = \sum_{i \in C}\ \sum_{D \subseteq N \setminus C} \eta(i|D) \tag{3} \]

($2^{|N|} - |N| - 1$ cluster inequalities).
Taking into account the equality constraints (2) for $i \in C$, (3) takes the form

\[ 1 \le \sum_{i \in C} \Big[\, 1 - \sum_{B \subseteq N \setminus \{i\},\ B \cap C \ne \emptyset} \eta(i|B) \,\Big]. \]

Remark No cluster inequality for $C = \emptyset$ is defined; the cluster inequalities for
$|C| = 1$ are omitted because they follow trivially from the equality constraints.

Example 2 In case $N = \{a, b, c\}$ every $\boldsymbol{\eta}$-vector has length 12 and its components
decompose into three blocks that correspond to variables $a$, $b$ and $c$. Thus,
one has twelve non-negativity constraints, three equality constraints and four
cluster inequalities of two types:

• $1 \le \eta(a|\emptyset) + \eta(a|\{c\}) + \eta(b|\emptyset) + \eta(b|\{c\})$, (for $C = \{a, b\}$)

• $1 \le \eta(a|\emptyset) + \eta(b|\emptyset) + \eta(c|\emptyset)$. (for $C = \{a, b, c\}$)

The constraints (1) and (2) are clearly valid for any vector $\boldsymbol{\eta}_G$ of a directed
graph $G$; the inequalities (3) hold in the acyclic case - see Lemma 4.



3.1.2 Jaakkola et al.'s approximation is an LP relaxation

The polyhedral approximation from § 3.1.1 is an LP relaxation of the corresponding
polytope, by which we mean that the only lattice points in the approximation
are the lattice points in the polytope. First, we observe that the
polyhedron $J'$ given by non-negativity and equality constraints is an integral
polytope.

Lemma 3 Let $J'$ be the polyhedron given by (1) and (2). Then $J'$ is a polytope
whose vertices are just the codes of (general) directed graphs over $N$. Moreover,
the only lattice points in $J'$ are its vertices.

Proof. Let $\boldsymbol{\eta}$ belong to $J'$. For every block of components of $\boldsymbol{\eta}$ corresponding
to $i \in N$, the constraints define a vector in a "probability simplex". Assuming
$\boldsymbol{\eta}$ is a vertex of $J'$, for each $i \in N$, the respective block has to be a vertex of
that simplex, that is, a 0-1-vector having just one component 1. If $B(i)$ is the
set indexing such a component for $i \in N$, we get the corresponding graph $G$
with $\boldsymbol{\eta} = \boldsymbol{\eta}_G$ by drawing arrows from the elements of $B(i)$ to $i$, for every $i \in N$.
Clearly, this defines a one-to-one correspondence between (general) directed graphs
over $N$ and vertices of $J'$.

Let $\boldsymbol{\eta}$ be a lattice point in $J'$. Within the block given by $i \in N$, components
are non-negative integers. Thus, if one of them exceeds 1, the sum exceeds 1.
Hence, $\boldsymbol{\eta}$ is a 0-1-vector. At most one component in a block is 1 since otherwise
the sum exceeds 1, and at least one is 1 since otherwise the sum is 0. □

Lemma 4 Let $J$ be the polyhedron given by constraints (1)-(3). Then the lattice
points in $J$ are exactly the codes of acyclic directed graphs over $N$.

Proof. Every lattice point in $J$ is a lattice point in $J'$, and, therefore, by Lemma 3,
encodes a (uniquely determined) directed graph $G$.

Consider the cluster inequality (3) for $C \subseteq N$, $|C| \ge 2$ and the vector $\boldsymbol{\eta}_G$
(encoding a directed graph $G$). For every $i \in C$, the $\eta_G(i|D)$ term is typically 0
and only once 1, namely in the case $D = \mathrm{pa}_G(i)$. Thus, the inner expression
for $i$ in (3), namely $\sum_{D \subseteq N \setminus C} \eta_G(i|D)$, is either 0 or 1. The latter happens if and
only if $\mathrm{pa}_G(i) \cap C = \emptyset$. That means, the cluster inequality for $C$ says there exists
at least one $i \in C$ with $\mathrm{pa}_G(i) \cap C = \emptyset$. Of course, this is true if $G$ is acyclic.

Now, we are going to show the converse: the cluster inequalities for $\boldsymbol{\eta}_G$ imply
that $G$ is acyclic. We start with applying the cluster inequality for $C = N$ and
find $i_1 \in N$ with $\mathrm{pa}_G(i_1) = \emptyset$. Thus, $i_1$ is an initial node in $G$ and we fix it. If
$|N \setminus \{i_1\}| \ge 2$ we take $C = N \setminus \{i_1\}$ and apply the cluster inequality for it. It
says there exists $i_2 \in C = N \setminus \{i_1\}$ with $\mathrm{pa}_G(i_2) \cap C = \emptyset$, that is, $\mathrm{pa}_G(i_2) \subseteq \{i_1\}$
(= $i_2$ is the initial node in the induced subgraph $G_{N \setminus \{i_1\}}$).

Again, if $|N \setminus \{i_1, i_2\}| \ge 2$ we continue with $C = N \setminus \{i_1, i_2\}$, and so on. In
this way, we find iteratively an ordering $i_1, \ldots, i_{|N|}$ consistent with the direction
of arrows in $G$. This already implies $G$ is acyclic. □
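The statement of Lemma 4 can be checked by brute force for three variables. The following Python sketch is an illustration under our own conventions (graphs as dictionaries of parent sets; the function names are hypothetical): it enumerates all 64 directed graphs over $\{a, b, c\}$ and compares the cluster inequalities, read for a graph code as "some $i \in C$ has no parent in $C$", with a direct acyclicity test.

```python
from itertools import combinations, product

def subsets(items, min_size=0):
    """All subsets of `items` with at least `min_size` elements."""
    items = sorted(items)
    return [frozenset(c) for r in range(min_size, len(items) + 1)
            for c in combinations(items, r)]

def satisfies_cluster_inequalities(nodes, parents):
    """For the code of a graph, (3) for C says: some i in C has no parent in C."""
    return all(any(not (parents[i] & C) for i in C)
               for C in subsets(nodes, min_size=2))

def is_acyclic(nodes, parents):
    """Repeatedly remove a node that has no parent among the remaining nodes."""
    remaining = set(nodes)
    while remaining:
        initial = [i for i in remaining if not (parents[i] & remaining)]
        if not initial:
            return False          # every remaining node has a remaining parent
        remaining.remove(initial[0])
    return True

nodes = ["a", "b", "c"]
choices = [subsets(set(nodes) - {i}) for i in nodes]       # parent sets per node
graphs = [dict(zip(nodes, pa)) for pa in product(*choices)]
acyclic = [g for g in graphs if is_acyclic(nodes, g)]
agree = all(satisfies_cluster_inequalities(nodes, g) == is_acyclic(nodes, g)
            for g in graphs)
print(len(graphs), len(acyclic), agree)   # 64 25 True
```

The acyclicity test mirrors the second half of the proof: it builds an ordering by repeatedly picking an initial node of the remaining induced subgraph.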



3.2 Standard imsets 

Standard imsets introduced in [11] have components indexed by subsets $T \subseteq N$.
Given an acyclic directed graph $G$ over $N$, the standard imset $\mathsf{u}_G$ encoding $G$
is defined as follows:

\[ \mathsf{u}_G = \delta_N - \delta_\emptyset + \sum_{i \in N} \big[\, \delta_{\mathrm{pa}_G(i)} - \delta_{\{i\} \cup \mathrm{pa}_G(i)} \,\big]. \]

A basic property of standard imsets is that they are unique representatives of
Bayesian network structures. This means, one has $\mathsf{u}_G = \mathsf{u}_H$ if and only if $G$
and $H$ are independence equivalent acyclic directed graphs (= define the same
Bayesian network structure) - see Corollary 7.1 in [11]. In [12], it was proposed
to study the standard imset polytope, defined as the convex hull of the set of
vectors $\mathsf{u}_G$, where $G$ runs over all acyclic directed graphs over $N$.
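The definition translates directly into code. The sketch below is an illustration only (the function name `standard_imset` and the sparse dictionary representation are our assumptions); it computes $\mathsf{u}_G$ for three graphs over $\{a, b, c\}$ and shows that two equivalent graphs share their standard imset while a collider does not.

```python
from collections import Counter

def standard_imset(nodes, parents):
    """u_G = delta_N - delta_empty + sum_i [delta_pa(i) - delta_({i} u pa(i))].

    Returned sparsely as {subset: non-zero integer coefficient}.
    """
    u = Counter()
    u[frozenset(nodes)] += 1
    u[frozenset()] -= 1
    for i in nodes:
        pa = frozenset(parents[i])
        u[pa] += 1
        u[pa | {i}] -= 1
    return {T: v for T, v in u.items() if v != 0}

nodes = ["a", "b", "c"]
chain = {"a": set(), "b": {"a"}, "c": {"b"}}            # a -> b -> c
fork = {"a": {"b"}, "b": set(), "c": {"b"}}             # a <- b -> c
collider = {"a": set(), "b": {"a", "c"}, "c": set()}    # a -> b <- c

print(standard_imset(nodes, chain) == standard_imset(nodes, fork))      # True
print(standard_imset(nodes, chain) == standard_imset(nodes, collider))  # False
```

For the chain and the fork the result is $\delta_{abc} - \delta_{ab} - \delta_{bc} + \delta_{b}$, the semi-elementary imset encoding $a \perp\!\!\!\perp c \mid b$.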



3.2.1 Outer approximation of the standard imset polytope 

In [14], an outer approximation of the standard imset polytope in terms of
linear constraints was suggested. More specifically, three types of constraints
were considered (for $\mathsf{u} = \mathsf{u}_G$):

• equality constraints:

\[ \sum_{T \subseteq N} \mathsf{u}(T) = 0, \qquad \forall\, j \in N \quad \sum_{T \subseteq N,\ j \in T} \mathsf{u}(T) = 0, \tag{4} \]

which implies that $\mathsf{u}$-vectors are determined uniquely by their components
$\mathsf{u}(T)$ for $T \subseteq N$, $|T| \ge 2$,

• specific inequality constraints of the form:

\[ \sum_{T \in \mathcal{A}} \mathsf{u}(T) \le 1, \tag{5} \]

where $\mathcal{A}$ is a non-empty class of non-empty subsets of $N$, closed under
supersets,

• non-specific inequality constraints of the form:

\[ \langle m, \mathsf{u} \rangle = \sum_{T \subseteq N} m(T) \cdot \mathsf{u}(T) \ge 0, \tag{6} \]

where $m$ is (a representative of an extreme standardized) supermodular
function. Here, by a supermodular function is meant a real function $m$ on
the power set $\mathcal{P}(N)$ (= a vector in $\mathbb{R}^{\mathcal{P}(N)}$) such that

\[ m(E \cup F) + m(E \cap F) \ge m(E) + m(F) \quad \text{for every } E, F \subseteq N. \]

It is standardized if $m(T) = 0$ whenever $|T| \le 1$.
Note that the class of standardized supermodular functions on $\mathcal{P}(N)$ is a pointed
rational polyhedral cone, and, therefore, has finitely many extreme rays. Each
extreme ray contains a uniquely determined non-zero lattice point whose components
have no common prime divisor (this is the representative of the extreme
ray). Therefore, (6) gives in fact finitely many linear inequality constraints on
$\mathsf{u} = \mathsf{u}_G$. The problem is that one has to compute those representatives of extreme
supermodular functions, which is a difficult computational task. The representatives
were computed for $|N| \le 5$ [10].

Thus, in comparison with the polyhedral approximation (of the $\boldsymbol{\eta}$-polytope)
mentioned in § 3.1.1, this polyhedral approximation (of the standard imset polytope)
is implicit. This is a disadvantage from the practical point of view because
to apply common methods of linear programming one still needs to make
the considered inequality constraints explicit for any $|N|$.

Example 5 In case $N = \{a, b, c\}$ every $\mathsf{u}$-vector has length 8. There are
four equality constraints (4) which break into two types:

• $\mathsf{u}(\emptyset) = -\mathsf{u}(a) - \mathsf{u}(b) - \mathsf{u}(c) - \mathsf{u}(\{a, b\}) - \mathsf{u}(\{a, c\}) - \mathsf{u}(\{b, c\}) - \mathsf{u}(\{a, b, c\})$,

• $\mathsf{u}(a) = -\mathsf{u}(\{a, b\}) - \mathsf{u}(\{a, c\}) - \mathsf{u}(\{a, b, c\})$. (for $j = a$)

Therefore, the dimension (of the standard imset polytope) is 4 and the $\mathsf{u}$-vectors
are determined by their components for sets $\{a, b\}$, $\{a, c\}$, $\{b, c\}$ and $\{a, b, c\}$.

As concerns specific inequality constraints, every non-empty class $\mathcal{A}$ of
non-empty subsets of $N$ closed under supersets is uniquely determined by the
class $\mathcal{A}_{\min}$ of its minimal sets with respect to inclusion. One has eighteen such
classes which break into eight types. For example, $\mathcal{A}_{\min} = \{ab, ac, bc\}$ gives the
inequality

\[ \mathsf{u}(\{a, b\}) + \mathsf{u}(\{a, c\}) + \mathsf{u}(\{b, c\}) + \mathsf{u}(\{a, b, c\}) \le 1. \]

As concerns non-specific inequality constraints, the cone of standardized supermodular
functions has five extreme rays in case $|N| = 3$ [10], which leads to five
inequalities breaking into three types:

• $\mathsf{u}(\{a, b, c\}) \ge 0$,

• $\mathsf{u}(\{a, b\}) + \mathsf{u}(\{a, b, c\}) \ge 0$,

• $\mathsf{u}(\{a, b\}) + \mathsf{u}(\{a, c\}) + \mathsf{u}(\{b, c\}) + 2 \cdot \mathsf{u}(\{a, b, c\}) \ge 0$.

Note that the described system of inequalities can be reduced; some of the
specific inequalities appear to follow from the non-specific ones in combination
with equality constraints and other specific inequalities. For example, if $\mathcal{A}_{\min}$
consists of one singleton only, then the respective specific inequality (5) is vacuous
because it trivially follows from the equality constraints (4). Actually, all
specific inequalities with $\mathcal{A}_{\min}$ containing a singleton are superfluous in case
$|N| = 3$. However, this is not true in case $|N| \ge 4$.
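The constraints listed in Example 5 can be sanity-checked on all standard imsets for $|N| = 3$. The Python sketch below is an illustration with hypothetical helper names: it enumerates the acyclic directed graphs over $\{a, b, c\}$, computes their standard imsets and tests the equality constraints (4), the specific inequality for $\mathcal{A}_{\min} = \{ab, ac, bc\}$ and the five non-specific inequalities. It tests only these listed constraints, not the full system (4)-(6).

```python
from collections import Counter
from itertools import combinations, product

NODES = ("a", "b", "c")
FULL = frozenset(NODES)
PAIRS = [frozenset(p) for p in combinations(NODES, 2)]

def subsets(items):
    items = sorted(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def is_acyclic(parents):
    remaining = set(NODES)
    while remaining:
        initial = [i for i in remaining if not (parents[i] & remaining)]
        if not initial:
            return False
        remaining.remove(initial[0])
    return True

def standard_imset(parents):
    """Standard imset as a Counter (missing sets read as 0)."""
    u = Counter({FULL: 1, frozenset(): -1})
    for i in NODES:
        u[parents[i]] += 1
        u[parents[i] | {i}] -= 1
    return u

def satisfies_example5(u):
    pair_sum = sum(u[S] for S in PAIRS)
    equalities = (sum(u.values()) == 0 and
                  all(sum(v for T, v in u.items() if j in T) == 0 for j in NODES))
    specific = pair_sum + u[FULL] <= 1             # A_min = {ab, ac, bc}
    non_specific = (u[FULL] >= 0
                    and all(u[S] + u[FULL] >= 0 for S in PAIRS)
                    and pair_sum + 2 * u[FULL] >= 0)
    return equalities and specific and non_specific

choices = [subsets(set(NODES) - {i}) for i in NODES]
dags = [g for g in (dict(zip(NODES, pa)) for pa in product(*choices))
        if is_acyclic(g)]
imsets = {frozenset((T, v) for T, v in standard_imset(g).items() if v)
          for g in dags}
print(len(dags), len(imsets),
      all(satisfies_example5(standard_imset(g)) for g in dags))   # 25 11 True
```

The 25 acyclic directed graphs collapse to 11 distinct standard imsets, one per Bayesian network structure over three variables.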



The constraints (4)-(6) were conjectured in [13] to completely characterize
the standard imset polytope and this conjecture was verified for $|N| \le 4$. Nevertheless,
one perhaps does not need a complete facet description (= polyhedral
characterization) of the polytope. To apply some advanced methods of integer
programming the confirmation of a weaker version of the conjecture might be
enough. The weaker version of the conjecture from [13] is that the polyhedron
given by (4)-(6) is an LP relaxation of the standard imset polytope.

Before writing this report, we confirmed computationally the weaker version
for $|N| = 5$. The extreme rays of the cone of supermodular functions for $|N| = 5$
were obtained from [10] and independently computed using 4ti2 [T7], thus
giving the non-specific inequality constraints (6). Specific inequality constraints
(5) were obtained from [Tl], where it was also calculated that there are 8,782
standard imsets for $|N| = 5$. Since the characteristic imsets (described in § 3.3)
are 0-1-vectors and are in one-to-one correspondence to the standard imsets,
we simply enumerated all vectors in $\{0, 1\}^{26}$, applied the inverse transform
(11) to get the corresponding $\mathsf{u}$-vectors, and tested whether they satisfied the
above inequalities. By operating over $\mathcal{P}_2(N)$, and properly modifying the above
inequalities, the equality constraints (4) were satisfied. We verified that there
were exactly 8,782 integer solutions to the constraints (4)-(6) for $|N| = 5$.

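3.2.2 η to standard imset

Taking into account the definition of $\boldsymbol{\eta}_G$, it is easy to see that $\mathsf{u}_G$ is obtained
from $\boldsymbol{\eta}_G$ by applying the following mapping $\boldsymbol{\eta} \mapsto \mathsf{u}^{\eta}$. For any $T \subseteq N$, we put

\[ \mathsf{u}^{\eta}(T) = \delta_N(T) - \delta_\emptyset(T) + \sum_{i \in N}\ \sum_{B \subseteq N \setminus \{i\}} \eta(i|B) \cdot \{ \delta_B(T) - \delta_{\{i\} \cup B}(T) \}. \tag{7} \]

This is clearly an affine mapping, ascribing lattice points to lattice points. Assuming
$\boldsymbol{\eta}$ belongs to the affine subspace specified by equality constraints (2),
we re-write (7) as follows:

\[
\begin{aligned}
\mathsf{u}^{\eta}(T) &= \delta_N(T) - \delta_\emptyset(T) + \sum_{i \in N} \eta(i|\emptyset) \cdot \{ \delta_\emptyset(T) - \delta_{\{i\}}(T) \}
 + \sum_{i \in N}\ \sum_{\emptyset \ne B \subseteq N \setminus \{i\}} \eta(i|B) \cdot \{ \delta_B(T) - \delta_{\{i\} \cup B}(T) \} \\
&= \delta_N(T) - \delta_\emptyset(T) + \sum_{i \in N} \Big\{ 1 - \sum_{\emptyset \ne B \subseteq N \setminus \{i\}} \eta(i|B) \Big\} \cdot \{ \delta_\emptyset(T) - \delta_{\{i\}}(T) \}
 + \sum_{i \in N}\ \sum_{\emptyset \ne B \subseteq N \setminus \{i\}} \eta(i|B) \cdot \{ \delta_B(T) - \delta_{\{i\} \cup B}(T) \} \\
&= \underbrace{\delta_N(T) + (|N| - 1) \cdot \delta_\emptyset(T) - \sum_{j \in N} \delta_{\{j\}}(T)}_{\mathsf{u}^{\emptyset}(T)\, \in\, \mathbb{Z}}
 \; - \sum_{i \in N}\ \sum_{\emptyset \ne B \subseteq N \setminus \{i\}} \eta(i|B) \cdot \underbrace{\{ \delta_\emptyset(T) - \delta_{\{i\}}(T) - \delta_B(T) + \delta_{\{i\} \cup B}(T) \}}_{\mathsf{u}_{\langle i,B|\emptyset\rangle}(T)\, \in\, \{-1, 0, +1\}},
\end{aligned}
\]

where $\mathsf{u}^{\emptyset}$ denotes the standard imset corresponding to the empty graph over $N$
and $\mathsf{u}_{\langle i,B|\emptyset\rangle}$ the semi-elementary imset encoding $i \perp\!\!\!\perp B \mid \emptyset$.

Briefly, if $\boldsymbol{\eta}$ satisfies (2) then

\[ \mathsf{u}^{\eta} = \mathsf{u}^{\emptyset} - \sum_{i \in N}\ \sum_{\emptyset \ne B \subseteq N \setminus \{i\}} \eta(i|B) \cdot \mathsf{u}_{\langle i,B|\emptyset\rangle}. \]

In particular, $\mathsf{u} = \mathsf{u}^{\eta}$ belongs to the linear subspace specified by equality constraints
(4). This is because these equalities hold for both $\mathsf{u}^{\emptyset}$ and any $\mathsf{u}_{\langle i,B|\emptyset\rangle}$.
Note that the converse is true as well (we leave an easy proof to the reader): if
$\mathsf{u}$ satisfies (4) then there exists $\boldsymbol{\eta}$ satisfying (2) such that $\mathsf{u} = \mathsf{u}^{\eta}$. In particular,
(4) is the exact translation of (2) into the framework of standard imsets.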

3.3 Characteristic imsets 

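The characteristic imset (for an acyclic directed graph $G$), introduced in [13],
is obtained from the standard imset by an affine transformation. More specifically,
first, the portrait $\mathsf{p}_G$ of the standard imset $\mathsf{u}_G$ is obtained by a linear
transform; second, the portrait is subtracted from the constant 1-vector and the
characteristic imset $\mathsf{c}_G$ is obtained:

\[ \mathsf{p}(S) = \sum_{T,\ S \subseteq T \subseteq N} \mathsf{u}(T) \quad \text{for } S \subseteq N, \tag{8} \]

\[ \mathsf{c}(S) = 1 - \mathsf{p}(S) \quad \text{for } S \subseteq N. \tag{9} \]

Clearly, the equality constraints (4) are translated into the following tacit
restrictions on $\mathsf{c}$-vectors:

\[ \mathsf{c}(S) = 1 \quad \text{for } S \subseteq N,\ |S| \le 1. \tag{10} \]

Therefore, for an acyclic directed graph $G$ over $N$, the components of the characteristic
imset $\mathsf{c}_G$ for $|S| \le 1$ are ignored and $\mathsf{c}_G$ is formally considered to be
an element of $\mathbb{Z}^{\mathcal{P}_2(N)}$.

The mapping $\mathsf{u} \mapsto \mathsf{c}$ determined by (8)-(9) is invertible: one can compute
back the standard imset by the formula

\[ \mathsf{u}(T) = \sum_{S,\ T \subseteq S \subseteq N} (-1)^{|S \setminus T|} \cdot [\, 1 - \mathsf{c}(S) \,] \quad \text{for } T \subseteq N. \tag{11} \]

Indeed, to see it fix $S \subseteq N$, substitute (11) (with $S$ replaced by $D$) into the
expression for the portrait $\mathsf{p}(S)$ and change the order of summation:

\[
\sum_{T,\ S \subseteq T \subseteq N} \mathsf{u}(T)
= \sum_{T,\ S \subseteq T \subseteq N}\ \sum_{D,\ T \subseteq D \subseteq N} (-1)^{|D \setminus T|} \cdot \mathsf{p}(D)
= \sum_{D,\ S \subseteq D \subseteq N} \mathsf{p}(D) \cdot \underbrace{\sum_{T,\ S \subseteq T \subseteq D} (-1)^{|D \setminus T|}}_{\delta_S(D)}
= \mathsf{p}(S).
\]

Since the transformation is one-to-one, two acyclic directed graphs $G$ and $H$ are
independence equivalent if and only if $\mathsf{c}_G = \mathsf{c}_H$. Thus, the characteristic imset
is also a unique Bayesian network structure representative.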
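The transformation (8)-(9) and its inverse (11) are easy to try out numerically. The following Python sketch is an illustration (function names are our own); it starts from the standard imset $\delta_\emptyset - \delta_a - \delta_c + \delta_{ac}$ of the collider $a \to b \leftarrow c$, computes the characteristic imset and then recovers the standard imset.

```python
from itertools import combinations

def subsets(items):
    items = sorted(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def characteristic_from_standard(u, all_sets):
    """(8)-(9): c(S) = 1 - p(S), where p(S) sums u(T) over all supersets T of S."""
    return {S: 1 - sum(u.get(T, 0) for T in all_sets if S <= T)
            for S in all_sets}

def standard_from_characteristic(c, all_sets):
    """(11): alternating sum of 1 - c(S) over all supersets S of T."""
    return {T: sum((-1) ** len(S - T) * (1 - c[S]) for S in all_sets if T <= S)
            for T in all_sets}

all_sets = subsets("abc")
# Standard imset of the collider a -> b <- c.
u = {frozenset(): 1, frozenset("a"): -1, frozenset("c"): -1, frozenset("ac"): 1}

c = characteristic_from_standard(u, all_sets)
u_back = standard_from_characteristic(c, all_sets)

print({"".join(sorted(S)): c[S] for S in all_sets if len(S) >= 2})
# {'ab': 1, 'ac': 0, 'bc': 1, 'abc': 1}
print(all(u_back[T] == u.get(T, 0) for T in all_sets))   # True
```

The components for $|S| \le 1$ all come out as 1, in agreement with (10).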
3.3.1 Advantage of characteristic imsets

Since standard and characteristic imsets are in one-to-one correspondence, one
can transform the inequality constraints from § 3.2.1 into the framework of characteristic
imsets - see § 4.1.3 and § 4.2 for further details. One important consequence
of these transformed constraints are basic inequalities for characteristic
imsets valid in the acyclic case:

Corollary 6 The constraints (4)-(6) on $\mathsf{u}$ imply the inequalities $0 \le \mathsf{c}(S) \le 1$,
$S \subseteq N$, for the imset $\mathsf{c}$ ascribed to $\mathsf{u}$ by (8)-(9).

Proof. Because of (9), we show $0 \le \mathsf{p}(S) \le 1$ for $S \subseteq N$. First, (4) says $\mathsf{p}(S) = 0$
for $|S| \le 1$. Given $S \subseteq N$, $|S| \ge 2$, the class of sets $\mathcal{A} = \{T;\ S \subseteq T \subseteq N\}$ is
closed under supersets and, by (5), $\mathsf{p}(S) \le 1$. On the other hand, in (6), among
the (representatives of extreme) supermodular functions we find the function

\[ m^{S\uparrow}(T) = \begin{cases} 1 & \text{if } S \subseteq T, \\ 0 & \text{otherwise.} \end{cases} \]

In particular, among the non-specific inequality constraints is the inequality
$\mathsf{p}(S) = \sum_{T,\ S \subseteq T} \mathsf{u}(T) = \langle m^{S\uparrow}, \mathsf{u} \rangle \ge 0$. □

In particular, every characteristic imset $\mathsf{c}_G$ (for an acyclic directed graph $G$)
is a 0-1-vector, a fact already emphasized in [13], which is important
from the point of view of (possible future application of) methods of integer
programming.

Another advantage of characteristic imsets is that they are closer to the
graphical description (of Bayesian network structures) than standard imsets.
Specifically, for $S \subseteq N$, $|S| \ge 2$ one has

\[ \mathsf{c}_G(S) = 1 \iff \text{there exists } i \in S \text{ with } S \setminus \{i\} \subseteq \mathrm{pa}_G(i), \tag{12} \]

and there exists a polynomial algorithm for transforming the characteristic imset
$\mathsf{c}_G$ into the respective essential graph, which is a traditional unique graphical
representative of the Bayesian network structure given by $G$ - see [13].
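The graphical rule (12) can be compared with the algebraic definition on a small example. The sketch below is an illustration with hypothetical function names; it uses the acyclic graph $a \to c \leftarrow b$, $c \to d$ over four variables and computes the characteristic imset in both ways.

```python
from collections import Counter
from itertools import combinations

def subsets(items, min_size=0):
    items = sorted(items)
    return [frozenset(c) for r in range(min_size, len(items) + 1)
            for c in combinations(items, r)]

def characteristic_via_standard(nodes, parents):
    """Compute u_G first, then c(S) = 1 - sum of u(T) over supersets T of S."""
    u = Counter({frozenset(nodes): 1, frozenset(): -1})
    for i in nodes:
        pa = frozenset(parents[i])
        u[pa] += 1
        u[pa | {i}] -= 1
    return {S: 1 - sum(v for T, v in u.items() if S <= T)
            for S in subsets(nodes, 2)}

def characteristic_graphical(nodes, parents):
    """Rule (12): c(S) = 1 iff some i in S has all of S minus {i} as parents."""
    return {S: int(any(S - {i} <= set(parents[i]) for i in S))
            for S in subsets(nodes, 2)}

nodes = ["a", "b", "c", "d"]
parents = {"a": set(), "b": set(), "c": {"a", "b"}, "d": {"c"}}   # a -> c <- b, c -> d

c_alg = characteristic_via_standard(nodes, parents)
c_graph = characteristic_graphical(nodes, parents)
ones = sorted("".join(sorted(S)) for S, v in c_graph.items() if v == 1)
print(c_alg == c_graph, ones)   # True ['abc', 'ac', 'bc', 'cd']
```

Both computations give the same 0-1-vector on $\mathcal{P}_2(N)$; its support consists of the three edges and the collider set $\{a, b, c\}$.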



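3.3.2 η to characteristic imset

Lemma 7 The characteristic imset $\mathsf{c}_G$ is a linear function of $\boldsymbol{\eta}_G$ given by

\[ \mathsf{c}_{\eta}(S) = \sum_{i \in S}\ \sum_{B,\ S \setminus \{i\} \subseteq B \subseteq N \setminus \{i\}} \eta(i|B) \quad \text{where } |S| \ge 1. \tag{13} \]

Proof. Given $S \subseteq N$, substitute (7) into (8) and change the order of summation:

\[
\begin{aligned}
\mathsf{p}(S) &= \sum_{T,\ S \subseteq T \subseteq N} \Big[\, \delta_N(T) - \delta_\emptyset(T) + \sum_{i \in N}\ \sum_{B \subseteq N \setminus \{i\}} \eta(i|B) \cdot \{ \delta_B(T) - \delta_{\{i\} \cup B}(T) \} \,\Big] \\
&= \sum_{T,\ S \subseteq T \subseteq N} \delta_N(T) - \sum_{T,\ S \subseteq T \subseteq N} \delta_\emptyset(T)
 + \sum_{i \in N}\ \sum_{B \subseteq N \setminus \{i\}} \eta(i|B) \cdot \Big\{ \sum_{T,\ S \subseteq T \subseteq N} \delta_B(T) - \sum_{T,\ S \subseteq T \subseteq N} \delta_{\{i\} \cup B}(T) \Big\} \\
&= 1 - \delta_\emptyset(S) + \sum_{i \in N}\ \sum_{B \subseteq N \setminus \{i\}} \eta(i|B) \cdot \{ \delta(S \subseteq B) - \delta(S \subseteq \{i\} \cup B) \}.
\end{aligned}
\]

Realize that the expression $\delta(S \subseteq B) - \delta(S \subseteq \{i\} \cup B)$ vanishes if either $S \subseteq B$
or $S \setminus (\{i\} \cup B) \ne \emptyset$, otherwise it is $-1$. Thus, assuming $|S| \ge 1$, one has

\[
\begin{aligned}
\mathsf{p}(S) &= 1 + \sum_{i \in N}\ \sum_{B \subseteq N \setminus \{i\}} \eta(i|B) \cdot (-1) \cdot \delta(\, i \in S,\ S \subseteq \{i\} \cup B \,) \\
&= 1 - \sum_{i \in S}\ \sum_{B \subseteq N \setminus \{i\}} \eta(i|B) \cdot \delta(\, S \setminus \{i\} \subseteq B \,),
\end{aligned}
\]

because, in case $i \in S$, $S \subseteq \{i\} \cup B$ is equivalent to $S \setminus \{i\} \subseteq B$. Taking
(9) into consideration we get (13). □

Let us call the mapping given by (13) the characteristic transformation. It
can formally be applied to any $\boldsymbol{\eta}$-vector, in particular, to the code $\boldsymbol{\eta}_G$ of a
general directed graph $G$. Thus, we get a formula for the "quasi-characteristic"
imset (= an element of $\mathbb{Z}^{\mathcal{P}_2(N)}$) ascribed to a graph over $N$:

\[ \mathsf{c}_G(S) = \text{number of super-terminal nodes in } S \quad \text{for } S \subseteq N,\ |S| \ge 2. \tag{14} \]

Here, a super-terminal node (in $S$) means $i \in S$ such that for all $j \in S \setminus \{i\}$
one has $j \to i$ in $G$. Indeed, having fixed $S$, $|S| \ge 2$, and $i \in S$, the expression
$\sum_{B,\ S \setminus \{i\} \subseteq B \subseteq N \setminus \{i\}} \eta_G(i|B)$ is either 0 or 1 depending upon whether $S \setminus \{i\} \subseteq \mathrm{pa}_G(i)$.
Observe that (12) is a special case of (14) since, in case of an acyclic directed
graph, any set $S$ has at most one super-terminal node.

Example 8 Consider the graph $G$ from Example 1. Then $\mathsf{c}_G(\{a, c\}) = 0$,
$\mathsf{c}_G(\{b, c\}) = \mathsf{c}_G(\{a, b, c\}) = 1$ and $\mathsf{c}_G(\{a, b\}) = 2$. Observe that $\mathsf{c}_G$ does not
satisfy the basic constraints $0 \le \mathsf{c} \le 1$ valid in the acyclic case. This is because $G$ is
not acyclic.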
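Example 8 can be reproduced by applying the characteristic transformation (13) to the code of the graph from Example 1 and comparing with the count of super-terminal nodes in (14). The Python sketch below is an illustration; all function names are assumptions of the sketch.

```python
from itertools import combinations

def subsets(items, min_size=0):
    items = sorted(items)
    return [frozenset(c) for r in range(min_size, len(items) + 1)
            for c in combinations(items, r)]

def eta_code(nodes, parents):
    """eta(i|B) = 1 iff B is the parent set of i."""
    return {(i, B): int(B == frozenset(parents[i]))
            for i in nodes for B in subsets(set(nodes) - {i})}

def characteristic_transformation(eta, nodes):
    """(13): c(S) sums eta(i|B) over i in S and B containing S minus {i}."""
    return {S: sum(v for (i, B), v in eta.items() if i in S and S - {i} <= B)
            for S in subsets(nodes, 2)}

def super_terminal_count(nodes, parents):
    """(14): number of i in S such that every other node of S is a parent of i."""
    return {S: sum(1 for i in S if S - {i} <= parents[i])
            for S in subsets(nodes, 2)}

nodes = ["a", "b", "c"]
parents = {"a": {"b"}, "b": {"a", "c"}, "c": set()}      # a <-> b <- c

c_eta = characteristic_transformation(eta_code(nodes, parents), nodes)
c_cnt = super_terminal_count(nodes, parents)
print({"".join(sorted(S)): v for S, v in c_eta.items()}, c_eta == c_cnt)
# {'ab': 2, 'ac': 0, 'bc': 1, 'abc': 1} True
```

The value 2 for $\{a, b\}$ arises because both $a$ and $b$ are super-terminal in that set, which cannot happen in an acyclic graph.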



4 Transformation of inequality constraints 

In § 3.2.2 and § 3.3.2 we have described mappings which transform the $\boldsymbol{\eta}$-vectors
used by Jaakkola et al. [4] to standard/characteristic imsets. The advantage of
the $\boldsymbol{\eta}$-polytope is the existence of a good (= explicit) outer polyhedral approximation
(see Lemma 4 in § 3.1.2). In this section, we characterize the image of
that polyhedral approximation (by the above maps) and compare the transformed
approximation (of the $\boldsymbol{\eta}$-polytope) with the approximation of the standard
imset polytope from § 3.2.1. The main technical difficulty we have to tackle is
that the mappings transforming $\boldsymbol{\eta}$-vectors to imsets are many-to-one. Another
feature is that the transformation raises the number of linear constraints. To
clarify the reasons for that, in § 4.1 we first deal with the transformation of
elementary constraints (1)-(2) and, later, in § 4.2 with the transformation of
cluster inequalities (3).

4.1 Transformation of elementary η-constraints

Now, the question of our interest is to transform the constraints (1)-(2) only,
that is, to characterize the form of the inequalities of the image of the polyhedron
$J'$ from Lemma 3. Let us start with an example, illustrating our method.

Example 9 Consider $N = \{a, b, c\}$, the polyhedron $J'$ and the characteristic
transformation $\boldsymbol{\eta} \mapsto \mathsf{c}$ given by (13). The idea is to transform each vertex of $J'$
and take the convex hull $R$ of the images of vertices. Because of linearity of the
map $\boldsymbol{\eta} \mapsto \mathsf{c}$, the polytope $R$ is the image of $J'$. Thus, it is enough to find the
facet description of $R$; this is the exact translation of (1)-(2) then.

The vertices of $J'$ are exactly the codes of general directed graphs (see Lemma 3)
and their images are given by (14). Thus, the (permutation type representatives
of) images of vertices of $J'$ were obtained in this way. Here they are (the
order of components is $ab$, $ac$, $bc$, $abc$):

[0,0,0,0], [1,0,0,0], [2,0,0,0], [2,1,0,0], [1,1,0,0], [1,1,1,0],
[1,1,0,1], [2,1,0,1], [2,2,0,1], [1,1,1,1], [2,1,1,1],
[2,1,1,2], [2,2,1,2], [2,2,2,3].

Remaining images can be obtained by permutation of the first 3 components. We
computed the facet description of their convex hull $R$ by Polymake [3]. The
result had fifteen inequalities. Here, we only record the (permutation) types
of obtained inequalities:

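• $0 \le \mathsf{c}(ab)$,

• $0 \le 2 - \mathsf{c}(ab)$,

• $0 \le 3 - \mathsf{c}(ab) - \mathsf{c}(ac) - \mathsf{c}(bc) + \mathsf{c}(abc)$,

• $0 \le \mathsf{c}(abc)$,

• $0 \le 1 + \mathsf{c}(ab) - \mathsf{c}(abc)$,

• $0 \le \mathsf{c}(ab) + \mathsf{c}(ac) - \mathsf{c}(abc)$,

• $0 \le \mathsf{c}(ab) + \mathsf{c}(ac) + \mathsf{c}(bc) - 2\,\mathsf{c}(abc)$.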

To make sure we computed the vertices of the polyhedron given by these inequalities.
The (permutation) type representatives are as follows:

[0,0,0,0], [2,0,0,0], [2,1,0,0], [1,1,0,1], [2,1,0,1], [2,2,0,1], [2,1,1,2], [2,2,2,3].

We observe that some of the images of vertices of $J'$ are convex combinations of the
others: for example, [1,0,0,0] comes from [0,0,0,0] and [2,0,0,0]. Note that
the original polyhedron $J'$ was given by twelve inequalities (and three equality
constraints). Since $R$ is given by fifteen inequality (and four implicit equality)
constraints, the transformation to the framework of characteristic imsets raised
the number of inequality constraints.

Another interesting observation is that the obtained fifteen inequalities in
fact coincide with the translation of specific inequality constraints (5) to the
framework of characteristic imsets in case $N = \{a, b, c\}$ - see Example [M] for
details.
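The first part of Example 9 can be re-traced without a polyhedral solver. The Python sketch below is an illustration (it does not replace the facet computation done with Polymake, and its function names are our own): it maps all 64 directed graphs over $\{a, b, c\}$ by (14), groups the images into permutation types and checks that every image satisfies the fifteen listed inequalities.

```python
from itertools import combinations, product

NODES = ("a", "b", "c")
SETS = [frozenset("ab"), frozenset("ac"), frozenset("bc"), frozenset("abc")]

def subsets(items):
    items = sorted(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def image(parents):
    """(14): (c(ab), c(ac), c(bc), c(abc)) as numbers of super-terminal nodes."""
    return tuple(sum(1 for i in S if S - {i} <= parents[i]) for S in SETS)

def satisfies_facets(c):
    """All fifteen inequalities, i.e. every permutation of the seven listed types."""
    ab, ac, bc, abc = c
    pairs = (ab, ac, bc)
    return (all(0 <= x <= 2 for x in pairs)
            and 0 <= 3 - ab - ac - bc + abc
            and 0 <= abc
            and all(0 <= 1 + x - abc for x in pairs)
            and all(0 <= x + y - abc for x, y in combinations(pairs, 2))
            and 0 <= ab + ac + bc - 2 * abc)

choices = [subsets(set(NODES) - {i}) for i in NODES]
images = {image(dict(zip(NODES, pa))) for pa in product(*choices)}
# Permuting the variables permutes the first three components arbitrarily.
types = {tuple(sorted(c[:3], reverse=True)) + (c[3],) for c in images}
print(len(types), all(satisfies_facets(c) for c in images))   # 14 True
```

The fourteen permutation types found this way are the representatives listed in the example.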

This leads to a natural conjecture that Jaakkola et al.'s elementary constraints
(1)-(2) are equivalent to our specific constraints for any $|N|$. We confirm
this conjecture below, directly by considering the transformation $\boldsymbol{\eta} \mapsto \mathsf{u}$.
Later, we transform the specific constraints to the framework of characteristic
imsets (see § 4.1.3).

4.1.1 Translation to the framework of standard imsets 

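Thus, the task is to characterize in terms of $\mathsf{u}$ the image (by $\boldsymbol{\eta} \mapsto \mathsf{u}^{\eta}$) of the
polytope $J'$ given by non-negativity and equality constraints. More specifically,
we wish to have a finite system of linear inequalities on $\mathsf{u}$ which together with
(4) - see § 3.2.2 - characterize those $\mathsf{u} \in \mathbb{R}^{\mathcal{P}(N)}$ for which

\[ \exists\, \boldsymbol{\eta} \text{ satisfying (1), (2) and } \mathsf{u}^{\eta}(T) = \mathsf{u}(T) \text{ for any } T \subseteq N,\ |T| \ge 2. \tag{15} \]

This task can equivalently be formulated as follows. Let us put $m = 2^{|N|} - 1$,
$n = |N| \cdot 2^{|N|-1}$ and consider a special $m \times n$ matrix $\mathbf{A}$, whose

• rows correspond to sets $T \subseteq N$, $|T| \ge 1$,

• columns correspond to pairs $(i|B)$ where $i \in N$, $B \subseteq N \setminus \{i\}$.

More specifically, the entry $a[T, (i|B)]$ of $\mathbf{A}$ is given by

\[
\begin{aligned}
a[T, (i|B)] &= \delta_{\{i\} \cup B}(T) - \delta_B(T) && \text{if } |T| \ge 2, \\
a[T, (i|B)] &= \delta_{\{i\}}(T) && \text{if } |T| = 1.
\end{aligned}
\tag{16}
\]

Moreover, to any $\mathsf{u} \in \mathbb{R}^{\mathcal{P}(N)}$, we ascribe a column $m$-vector $\mathbf{b}_{\mathsf{u}}$ whose components
$b_{\mathsf{u}}[T]$ are specified as follows:

\[
\begin{aligned}
b_{\mathsf{u}}[T] &= \delta_N(T) - \mathsf{u}(T) && \text{if } |T| \ge 2, \\
b_{\mathsf{u}}[T] &= 1 && \text{if } |T| = 1.
\end{aligned}
\]

Then (15) is equivalent to the condition

\[ \exists\, \boldsymbol{\eta} \in \mathbb{R}^n \text{ satisfying } \boldsymbol{\eta} \ge 0 \text{ and } \mathbf{A}\boldsymbol{\eta} = \mathbf{b}_{\mathsf{u}}. \tag{17} \]

Indeed, (1) means $\boldsymbol{\eta} \ge 0$, while (2) for $j \in N$ is the requirement that the component
of $\mathbf{b}_{\mathsf{u}}$ for $T = \{j\}$, which is 1, coincides with the respective component of $\mathbf{A}\boldsymbol{\eta}$:

\[ 1 = \sum_{(i|B)} a[\{j\}, (i|B)] \cdot \eta(i|B) = \sum_{i \in N}\ \sum_{B \subseteq N \setminus \{i\}} \delta_{\{i\}}(\{j\}) \cdot \eta(i|B) = \sum_{B \subseteq N \setminus \{j\}} \eta(j|B). \]

Analogously, for fixed $T \subseteq N$, $|T| \ge 2$, the equality $\mathsf{u}(T) = \mathsf{u}^{\eta}(T)$ has, by (7), the form

\[ \mathsf{u}(T) = \delta_N(T) - \sum_{i \in N}\ \sum_{B \subseteq N \setminus \{i\}} \underbrace{\{ \delta_{\{i\} \cup B}(T) - \delta_B(T) \}}_{a[T, (i|B)]} \cdot \eta(i|B) \]

and can be expressed equivalently as

\[ \sum_{i \in N}\ \sum_{B \subseteq N \setminus \{i\}} a[T, (i|B)] \cdot \eta(i|B) = \delta_N(T) - \mathsf{u}(T) = b_{\mathsf{u}}[T], \]

which means the components of $\mathbf{A}\boldsymbol{\eta}$ and $\mathbf{b}_{\mathsf{u}}$ for $T$ coincide.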

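Now, Farkas' lemma (see Corollary 7.1d in [9]) applied to $A$ and $b_u$ says that
(17) is equivalent to the requirement:

$$\forall\, y \in \mathbb{R}^m \qquad A^\top y \geq 0 \ \Rightarrow\ b_u^\top y \geq 0. \tag{18}$$

To simplify this requirement we re-write the condition $A^\top y \geq 0$ in this form:

$$\forall\, i \in N \qquad y(\{i\}) \geq 0, \tag{19}$$

$$\forall\, S \subseteq N,\ |S| = 2,\ \forall\, i \in S \qquad y(S) + y(\{i\}) \geq 0, \tag{20}$$

$$\forall\, S \subseteq N,\ |S| \geq 3,\ \forall\, i \in S \qquad y(S) + y(\{i\}) - y(S \setminus \{i\}) \geq 0. \tag{21}$$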

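Indeed, the rows of $A^\top$ correspond to pairs $(i|B)$, $i \in N$, $B \subseteq N \setminus \{i\}$. If $i \in N$ and
$B = \emptyset$ then the component of $A^\top y$ for $(i|\emptyset)$ is as follows:

$$\sum_{\emptyset \neq T \subseteq N} a[T,(i|\emptyset)] \cdot y(T) = \sum_{|T|=1} \delta_{\{i\}}(T) \cdot y(T) = y(\{i\}),$$

because $a[T,(i|\emptyset)] = 0$ for $|T| \geq 2$. This gives (19). If $i \in N$, $B \subseteq N \setminus \{i\}$ with
$|B| = 1$, then $a[T,(i|B)] = \delta_{\{i\}\cup B}(T)$ for $|T| \geq 2$ and one can write

$$\sum_{\emptyset \neq T \subseteq N} a[T,(i|B)] \cdot y(T) = \sum_{|T|=1} \delta_{\{i\}}(T) \cdot y(T) + \sum_{|T| \geq 2} \delta_{\{i\}\cup B}(T) \cdot y(T) = y(\{i\}) + y(\{i\} \cup B),$$

which leads to (20) for $S = \{i\} \cup B$. Finally, if $i \in N$, $B \subseteq N \setminus \{i\}$ with $|B| \geq 2$ then

$$\sum_{\emptyset \neq T \subseteq N} a[T,(i|B)] \cdot y(T) = \sum_{|T|=1} \delta_{\{i\}}(T) \cdot y(T) + \sum_{|T| \geq 2} \{\delta_{\{i\}\cup B}(T) - \delta_B(T)\} \cdot y(T) = y(\{i\}) + y(\{i\} \cup B) - y(B),$$

which leads to (21) for $S = \{i\} \cup B$.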

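The next step is to show that $\{y \in \mathbb{R}^m;\ A^\top y \geq 0\}$ is a pointed (rational
polyhedral) cone and characterize its extreme rays. In fact, we show that the
rays correspond to non-empty classes of sets $\mathcal{A} \subseteq \mathcal{P}_1(N)$ closed under supersets.
More specifically, we ascribe a vector $y_{\mathcal{A}} \in \mathbb{R}^m$ to any such class $\mathcal{A}$ by:

$$y_{\mathcal{A}}(T) = \delta(T \in \mathcal{A}) - |\{j \in N;\ \{j\} \in \mathcal{A} \ \&\ \{j\} \subset T\}| \quad \text{for } T \in \mathcal{P}_1(N). \tag{22}$$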

Here is the crucial observation: 

Lemma 10 A vector $y \in \mathbb{R}^m$ satisfies (19)-(21) if and only if it is a conic
combination (= a linear combination with non-negative real coefficients) of vectors
$y_{\mathcal{A}}$ for classes $\emptyset \neq \mathcal{A} \subseteq \mathcal{P}_1(N)$ closed under supersets.

Proof. First, we leave it to the reader to verify that any such vector $y_{\mathcal{A}}$ satisfies
(19)-(21), which implies the sufficiency of the condition.
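The verification left to the reader can be checked mechanically in a small case. The sketch below, for the assumed ground set $N = \{a,b,c\}$, enumerates every non-empty class closed under supersets, builds $y_{\mathcal{A}}$ by (22) with strict inclusion $\{j\} \subset T$, and tests (19)-(21). The helper names are hypothetical and chosen only for this illustration.

```python
from itertools import combinations

N = frozenset("abc")
P1 = [frozenset(c) for r in range(1, len(N) + 1)
      for c in combinations(sorted(N), r)]

def up_closed_classes():
    """Yield every non-empty class of non-empty sets closed under supersets."""
    for mask in range(1, 2 ** len(P1)):
        cls = {P1[k] for k in range(len(P1)) if mask >> k & 1}
        if all(S in cls for T in cls for S in P1 if T <= S):
            yield cls

def y_vector(cls):
    """The vector y_A from (22); '<' on frozensets is strict inclusion."""
    return {T: int(T in cls)
               - sum(1 for j in N if frozenset({j}) in cls and frozenset({j}) < T)
            for T in P1}

def satisfies_19_21(y):
    """Check the inequalities (19), (20) and (21) for a vector y indexed by P1."""
    for S in P1:
        for i in S:
            single = y[frozenset({i})]
            if len(S) == 1 and single < 0:
                return False
            if len(S) == 2 and y[S] + single < 0:
                return False
            if len(S) >= 3 and y[S] + single - y[S - {i}] < 0:
                return False
    return True

classes = list(up_closed_classes())
print(len(classes), all(satisfies_19_21(y_vector(c)) for c in classes))
```

For three variables there are 18 such classes, matching the eight symmetry types listed later in Example 14.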



To verify the converse implication, we ascribe to any $y \in \mathbb{R}^m$ satisfying
(19)-(21) the class of sets

$$\mathcal{A}_y = \{S \in \mathcal{P}_1(N);\ \exists\, T \in \mathcal{P}_1(N),\ T \subseteq S,\ y(T) \neq 0\},$$

which is clearly closed under supersets and non-empty if $y \neq 0$. The idea is to
prove the converse implication by induction on $|\mathcal{A}_y|$. If $|\mathcal{A}_y| = 0$ then $y = 0$ and
the claim that $y$ is a conic combination of those vectors is evident. If $|\mathcal{A}_y| \geq 1$
then it is enough to find some $\beta > 0$ such that $y' = y - \beta \cdot y_{\mathcal{A}_y}$ satisfies (19)-(21)
and $|\mathcal{A}_{y'}| < |\mathcal{A}_y|$.

From now on we fix $y \in \mathbb{R}^m$, $y \neq 0$ satisfying (19)-(21) and put:

$$\mathcal{A} = \mathcal{A}_y, \qquad y_* = y_{\mathcal{A}}, \qquad Y = \{i \in N;\ y(\{i\}) \neq 0\}.$$

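Observe a few basic facts:

$$y(S) = 0 \ \text{ for } S \in \mathcal{P}_1(N) \setminus \mathcal{A}, \qquad y(S) > 0 \ \text{ for } S \in \mathcal{A}_{\min}.$$

Indeed, assuming $S \in \mathcal{A}_{\min}$ one has $y(S) \neq 0$. If $|S| = 1$ then (19) implies $y(S) > 0$.
If $|S| = 2$ then $\{i\} \notin \mathcal{A}$ for both $i \in S$. Hence, $y(\{i\}) = 0$ and (20) gives $y(S) > 0$. If
$|S| \geq 3$ and $i \in S$, then both $\{i\} \notin \mathcal{A}$ and $S \setminus \{i\} \notin \mathcal{A}$ and (21) gives $y(S) > 0$.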
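In particular, since $\{j\} \in \mathcal{A}_{\min}$ for $j \in Y$, and $\{i\} \notin \mathcal{A}$ for $i \notin Y$,

$$\beta \equiv \min\,\{y(T);\ T \in \mathcal{A}_{\min}\} > 0, \quad \text{and} \quad y(\{j\}) \geq \beta > 0 \ \text{ for } j \in Y, \tag{23}$$

$$y(\{i\}) = 0 \ \text{ for } i \in N \setminus Y. \tag{24}$$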

Further, we observe that $y$ is a non-decreasing set function on subsets of $N \setminus Y$.
Indeed, it is enough to show $T \subseteq S \subseteq N \setminus Y$, $|S \setminus T| = 1 \Rightarrow y(S) \geq y(T)$. Take
$S \setminus T = \{i\}$; then $i \notin Y$ and $y(\{i\}) = 0$. If $|S| = 2$ then $T = \{j\}$ with $j \notin Y$ and
$y(S) = y(S) + y(\{i\}) \geq 0 = y(T)$ follows from (20) and (24). If $|S| \geq 3$ then (21) says
$y(S) + 0 - y(T) \geq 0$.

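This implies:

$$S \in \mathcal{A},\ S \cap Y = \emptyset \ \Rightarrow\ y(S) \geq \beta. \tag{25}$$

Indeed, it is enough to find $T \in \mathcal{A}_{\min}$, $T \subseteq S$ (of course, $T \cap Y = \emptyset$) and combine
$y(S) \geq y(T)$ with $y(T) \geq \beta$, which follows from the definition of $\beta$ in (23).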

Finally, we also have:

$$S \in \mathcal{P}_1(N),\ |S \cap Y| \leq 1 \ \Rightarrow\ y(S) \geq 0. \tag{26}$$

Indeed, this was verified in cases $|S| = 1$ and $|S \cap Y| = 0$ in (23)-(25). Assume $|S| \geq 2$
and $|S \cap Y| = 1$ and use induction on $|S|$. If $|S| = 2$ then $S = \{i,j\}$ with $i \notin Y$
and $j \in Y$ and (20)+(24) give $y(S) \geq -y(\{i\}) = 0$. If $|S| \geq 3$ then choose $i \in S \setminus Y$
and write by (21)+(24) $y(S) \geq y(S \setminus \{i\}) - y(\{i\}) = y(S \setminus \{i\})$. Now, $y(S \setminus \{i\}) \geq 0$
follows from the induction premise.

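To smooth later considerations let us gather the observations about $y_* = y_{\mathcal{A}}$
defined in (22). For singletons we have:

$$y_*(\{i\}) = 1 \ \text{ for } i \in Y, \qquad y_*(\{i\}) = 0 \ \text{ for } i \notin Y.$$

Given $S \subseteq N$, $|S| = 2$ we have:

$$y_*(S) = 1 \quad \text{if } S \cap Y = \emptyset,\ S \in \mathcal{A},$$

$$y_*(S) = 0 \quad \text{if either } [\, S \cap Y = \emptyset \ \&\ S \notin \mathcal{A} \,] \text{ or } |S \cap Y| = 1,$$

$$y_*(S) = -1 \quad \text{if } S \subseteq Y.$$

For $S \subseteq N$, $|S| \geq 3$ we have:

$$y_*(S) = 1 \quad \text{if } S \cap Y = \emptyset,\ S \in \mathcal{A},$$

$$y_*(S) = 0 \quad \text{if } S \cap Y = \emptyset,\ S \notin \mathcal{A},$$

$$y_*(S) = 1 - |S \cap Y| \quad \text{if } S \cap Y \neq \emptyset.$$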

To show that

$$y' = y - \beta \cdot y_*$$

satisfies (19), that is, $y'(\{i\}) \geq 0$ for $i \in N$, we distinguish two cases.

• If $i \notin Y$ then $y_*(\{i\}) = 0$ and (19) for $y$ implies the same inequality for $y'$.

• If $i \in Y$ then $y'(\{i\}) = y(\{i\}) - \beta \cdot y_*(\{i\}) = y(\{i\}) - \beta \cdot 1 = y(\{i\}) - \beta \geq 0$
owing to (23).



To show that $y'$ satisfies (20), that is, $y'(S) + y'(\{i\}) \geq 0$ for $S \subseteq N$, $|S| = 2$
and $i \in S$ we distinguish five cases.

• If $S \subseteq Y$ then $i \in Y$ and $y_*(S) + y_*(\{i\}) = (-1) + 1 = 0$ and (20) for $y$
implies the same inequality for $y'$, no matter what $\beta$ is.

• If $|S \cap Y| = 1$, $i \notin Y$ then $y_*(S) + y_*(\{i\}) = 0 + 0 = 0$ and (20) for $y$
implies what is desired, for the same reason.

• If $|S \cap Y| = 1$, $i \in Y$ then $y'(S) + y'(\{i\}) = y(S) - \beta \cdot y_*(S) + y(\{i\}) -
\beta \cdot y_*(\{i\}) = y(S) - \beta \cdot 0 + y(\{i\}) - \beta \cdot 1 = y(S) + y(\{i\}) - \beta$. However,
$y(\{i\}) - \beta \geq 0$ by (23) and $y(S) \geq 0$ by (26), which implies what is desired.

• If $S \cap Y = \emptyset$, $S \notin \mathcal{A}$ then $i \notin Y$ and $y_*(S) + y_*(\{i\}) = 0 + 0 = 0$ and (20)
for $y$ implies what is desired.

• If $S \cap Y = \emptyset$, $S \in \mathcal{A}$ then $i \notin Y$ and by (24) $y'(S) + y'(\{i\}) = y(S) -
\beta \cdot y_*(S) + y(\{i\}) - \beta \cdot y_*(\{i\}) = y(S) - \beta \cdot 1 + 0 - \beta \cdot 0 = y(S) - \beta$. The
desired inequality follows from (25).



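To show that $y'$ satisfies (21), that is, $y'(S) + y'(\{i\}) - y'(S \setminus \{i\}) \geq 0$ for
$S \subseteq N$, $|S| \geq 3$ and $i \in S$ we distinguish seven cases.

• If $S \cap Y = \emptyset$, $S \setminus \{i\} \in \mathcal{A}$ then $S \in \mathcal{A}$ and $i \notin Y$. Thus, $y_*(S) + y_*(\{i\}) -
y_*(S \setminus \{i\}) = (+1) + 0 - (+1) = 0$ and (21) for $y$ implies the same inequality
for $y'$.

• If $S \cap Y = \emptyset$, $S \notin \mathcal{A}$ (which implies $S \setminus \{i\} \notin \mathcal{A}$) then $y_*(S) + y_*(\{i\}) -
y_*(S \setminus \{i\}) = 0 + 0 - 0 = 0$ and (21) for $y$ implies what is desired.

• If $S \cap Y = \emptyset$, $S \in \mathcal{A}$, $S \setminus \{i\} \notin \mathcal{A}$ then $y(\{i\}) = 0 = y(S \setminus \{i\})$ and we can
write $y'(S) + y'(\{i\}) - y'(S \setminus \{i\}) = y(S) - \beta \cdot y_*(S) + y(\{i\}) - \beta \cdot y_*(\{i\}) -
y(S \setminus \{i\}) + \beta \cdot y_*(S \setminus \{i\}) = y(S) - \beta \cdot 1 + 0 - \beta \cdot 0 - 0 + \beta \cdot 0 = y(S) - \beta$,
which is non-negative by (25).

• If $S \cap Y \neq \emptyset$, $i \notin Y$ then $S \cap Y = (S \setminus \{i\}) \cap Y$ and $y_*(S) + y_*(\{i\}) -
y_*(S \setminus \{i\}) = (+1 - |S \cap Y|) + 0 - (+1 - |(S \setminus \{i\}) \cap Y|) = 0$. Thus, (21)
for $y$ implies what is desired.

• If $S \cap Y \neq \emptyset$, $i \in Y$, $(S \setminus \{i\}) \cap Y \neq \emptyset$ then $|S \cap Y| = 1 + |(S \setminus \{i\}) \cap Y|$ and
$y_*(S) + y_*(\{i\}) - y_*(S \setminus \{i\}) = (+1 - |S \cap Y|) + (+1) - (+1 - |(S \setminus \{i\}) \cap Y|) = 0$.
Thus, (21) for $y$ implies what is desired.

• If $S \cap Y \neq \emptyset$, $i \in Y$, $(S \setminus \{i\}) \cap Y = \emptyset$, $S \setminus \{i\} \in \mathcal{A}$ then $|S \cap Y| = 1$ and
$y_*(S) + y_*(\{i\}) - y_*(S \setminus \{i\}) = (+1 - 1) + (+1) - (+1) = 0$ and (21) for
$y$ implies what is desired.

• If $S \cap Y \neq \emptyset$, $i \in Y$, $(S \setminus \{i\}) \cap Y = \emptyset$, $S \setminus \{i\} \notin \mathcal{A}$ then also $|S \cap Y| = 1$
and $y(S \setminus \{i\}) = 0$, which allows us to write $y'(S) + y'(\{i\}) - y'(S \setminus \{i\}) =
y(S) - \beta \cdot y_*(S) + y(\{i\}) - \beta \cdot y_*(\{i\}) - y(S \setminus \{i\}) + \beta \cdot y_*(S \setminus \{i\}) =
y(S) - \beta \cdot (1 - 1) + y(\{i\}) - \beta \cdot 1 - 0 + \beta \cdot 0 = y(S) + y(\{i\}) - \beta$. However,
$y(\{i\}) - \beta \geq 0$ by (23) and $y(S) \geq 0$ by (26), which implies what is desired.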

Thus, $y'$ satisfies (19)-(21) and, because of the choice of $\beta$, $y'(T) = 0$ for
at least one $T \in \mathcal{A}_{\min}$ and $|\mathcal{A}_{y'}| < |\mathcal{A}_y|$, which concludes the induction step.
Indeed, realize that, by (22), $y_*(T) = y_{\mathcal{A}}(T) = 0$ for $T \in \mathcal{P}_1(N) \setminus \mathcal{A}$. □
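The induction in the proof is constructive: it peels off one multiple $\beta \cdot y_{\mathcal{A}_y}$ at a time. The following sketch illustrates that procedure for the assumed ground set $N = \{a,b,c\}$. The function `decompose` and the sample input are hypothetical, and termination is only guaranteed for inputs satisfying (19)-(21).

```python
from itertools import combinations

N = frozenset("abc")
P1 = [frozenset(c) for r in range(1, len(N) + 1)
      for c in combinations(sorted(N), r)]

def up_closure(minimal_sets):
    """Smallest class closed under supersets that contains the given sets."""
    return {S for S in P1 if any(T <= S for T in minimal_sets)}

def y_vector(cls):
    """The vector y_A from (22) for a class closed under supersets."""
    return {T: int(T in cls)
               - sum(1 for j in N if frozenset({j}) in cls and frozenset({j}) < T)
            for T in P1}

def decompose(y):
    """Peel off beta * y_A step by step, following the induction in the proof."""
    y, terms = dict(y), []
    while any(y.values()):
        cls = {S for S in P1 if any(y[T] != 0 for T in P1 if T <= S)}  # class A_y
        minimal = [S for S in cls if not any(T < S for T in cls)]      # A_min
        beta = min(y[S] for S in minimal)                              # as in (23)
        y_star = y_vector(cls)
        y = {T: y[T] - beta * y_star[T] for T in P1}
        terms.append((beta, cls))
    return terms

# Hypothetical input: a conic combination 2 * y_A1 + 3 * y_A2.
y1 = y_vector(up_closure([frozenset("a"), frozenset("b")]))
y2 = y_vector(up_closure([frozenset("ab"), frozenset("c")]))
y = {T: 2 * y1[T] + 3 * y2[T] for T in P1}

terms = decompose(y)
rebuilt = {T: sum(beta * y_vector(cls)[T] for beta, cls in terms) for T in P1}
print(rebuilt == y, [beta for beta, _ in terms])
```

The classes returned need not be the ones used to build the input, since a conic representation is in general not unique; only the reconstructed vector has to agree.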

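Now, Lemma 10 allows us to re-formulate the requirement (18) in the form
of finitely many conditions on $u$:

$$\forall\, \emptyset \neq \mathcal{A} \subseteq \mathcal{P}_1(N) \ \text{closed under supersets} \qquad b_u^\top y_{\mathcal{A}} \geq 0. \tag{27}$$

Indeed, if $y \in \mathbb{R}^m$ is such that $A^\top y \geq 0$ and $y = \sum_{\mathcal{A}} \lambda_{\mathcal{A}} \cdot y_{\mathcal{A}}$, $\lambda_{\mathcal{A}} \geq 0$, then $b_u^\top y =
\sum_{\mathcal{A}} \lambda_{\mathcal{A}} \cdot b_u^\top y_{\mathcal{A}} \geq 0$.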

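It remains to reformulate, given such an $\mathcal{A}$, the condition $b_u^\top y_{\mathcal{A}} \geq 0$. Assuming
$|N| \geq 2$, denote for this purpose $A = \{i \in N;\ \{i\} \in \mathcal{A}\}$ and write using the
definition of $b_u$ and $y_{\mathcal{A}}$ from (22):

$$\begin{aligned}
b_u^\top y_{\mathcal{A}} = \sum_{|T| \geq 1} b_u[T] \cdot y_{\mathcal{A}}(T)
&= \sum_{|T| = 1} y_{\mathcal{A}}(T) + \sum_{|T| \geq 2} \{\delta_N(T) - u(T)\} \cdot y_{\mathcal{A}}(T) \\
&= \sum_{|T| = 1} y_{\mathcal{A}}(T) + y_{\mathcal{A}}(N) - \sum_{|T| \geq 2} u(T) \cdot y_{\mathcal{A}}(T) \\
&= |A| + (1 - |A|) - \sum_{|T| \geq 2} u(T) \cdot y_{\mathcal{A}}(T) = 1 - \sum_{|T| \geq 2} u(T) \cdot y_{\mathcal{A}}(T).
\end{aligned}$$

Thus, $b_u^\top y_{\mathcal{A}} \geq 0$ is equivalent to $\sum_{|T| \geq 2} u(T) \cdot y_{\mathcal{A}}(T) \leq 1$. To get an even more
elegant form of it, assume $u$ satisfies (4) and observe

$$\begin{aligned}
\sum_{|T| \geq 2} u(T) \cdot |T \cap A| &= \sum_{|T| \geq 2} u(T) \cdot \sum_{i \in A} \delta(i \in T) = \sum_{|T| \geq 2} \sum_{i \in A} u(T) \cdot \delta(i \in T) \\
&= \sum_{i \in A} \sum_{|T| \geq 2} u(T) \cdot \delta(i \in T) = \sum_{i \in A} \ \sum_{|T| \geq 2,\ i \in T} u(T) = -\sum_{i \in A} u(\{i\}).
\end{aligned}$$

Therefore, we can write by (22):

$$\begin{aligned}
\sum_{|T| \geq 2} u(T) \cdot y_{\mathcal{A}}(T) &= \sum_{|T| \geq 2} u(T) \cdot \delta(T \in \mathcal{A}) - \sum_{|T| \geq 2} u(T) \cdot |T \cap A| \\
&= \sum_{|T| \geq 2} u(T) \cdot \delta(T \in \mathcal{A}) + \sum_{i \in A} u(\{i\}) = \sum_{|T| \geq 1} u(T) \cdot \delta(T \in \mathcal{A}) = \sum_{T \in \mathcal{A}} u(T),
\end{aligned}$$

which means that $b_u^\top y_{\mathcal{A}} \geq 0$ is equivalent to $\sum_{T \in \mathcal{A}} u(T) \leq 1$. Thus, under
validity of (4), (27) is equivalent to (5) and we have: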

Corollary 11 Provided $|N| \geq 2$, the condition (15) for $u \in \mathbb{R}^{\mathcal{P}(N)}$ is equivalent
to the simultaneous validity of (4) and (5).
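As a sanity check of one direction of this equivalence, the sketch below enumerates all acyclic directed graphs over three variables and confirms that their standard imsets satisfy every specific inequality $\sum_{T \in \mathcal{A}} u(T) \leq 1$. It assumes the usual formula $u_G = \delta_N - \delta_\emptyset + \sum_i \{\delta_{pa(i)} - \delta_{\{i\} \cup pa(i)}\}$ for the standard imset; all helper names are illustrative.

```python
from itertools import combinations, product

N = ("a", "b", "c")
P1 = [frozenset(c) for r in range(1, len(N) + 1) for c in combinations(N, r)]

def subsets(items):
    """All subsets (including the empty one) of the given items."""
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def is_acyclic(parents):
    """Repeatedly peel off nodes whose parents have all been peeled already."""
    done = set()
    while len(done) < len(parents):
        ready = [i for i in parents if i not in done and parents[i] <= done]
        if not ready:
            return False
        done.update(ready)
    return True

def standard_imset(parents):
    """u_G = delta_N - delta_0 + sum_i (delta_pa(i) - delta_({i} u pa(i)))."""
    u = {S: 0 for S in subsets(N)}
    u[frozenset(N)] += 1
    u[frozenset()] -= 1
    for i, pa in parents.items():
        u[pa] += 1
        u[pa | {i}] -= 1
    return u

def up_closed_classes():
    """Every non-empty class of non-empty sets closed under supersets."""
    for mask in range(1, 2 ** len(P1)):
        cls = {P1[k] for k in range(len(P1)) if mask >> k & 1}
        if all(S in cls for T in cls for S in P1 if T <= S):
            yield cls

choices = [subsets([j for j in N if j != i]) for i in N]
dags = [dict(zip(N, pa)) for pa in product(*choices)
        if is_acyclic(dict(zip(N, pa)))]
classes = list(up_closed_classes())
ok = all(sum(standard_imset(g)[T] for T in cls) <= 1
         for g in dags for cls in classes)
print(len(dags), len(classes), ok)
```

The check covers only codes of graphs (the lattice points); the corollary itself concerns the whole polytope.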

4.1.2 Remarks on the matrix A

Consider again the $m \times n$ matrix $A$ defined in (16); recall that $m = 2^{|N|} - 1$ and
$n = |N| \cdot 2^{|N|-1}$. We have observed in §4.1.1 that $A$ plays a central role in the
transition from $\eta$-vectors to standard imsets. Now, we show that $A$ has full row
rank by deriving its Hermite normal form (see §4.1 in [9] for this concept).

Proposition 12 The matrix $A$ has Hermite normal form $[I \ 0]$, where $I$ is the
$m \times m$ identity matrix and $0$ the $m \times (n - m)$ zero matrix.

Proof. The columns of $A$ are indexed by pairs $(i|B)$ and given by

$$A_{(i|\emptyset)} = \delta_{\{i\}} \quad \text{for } i \in N,$$

$$A_{(i|j)} = \delta_{\{i\}} + \delta_{\{i,j\}} \quad \text{for } i,j \in N,\ i \neq j,$$

$$A_{(i|B)} = \delta_{\{i\}} - \delta_B + \delta_{\{i\}\cup B} \quad \text{for } i \in N,\ B \subseteq N \setminus \{i\},\ |B| \geq 2.$$

Thus, $\delta_{\{i\}} = A_{(i|\emptyset)}$ and $\delta_{\{i,j\}} = A_{(i|j)} - A_{(i|\emptyset)}$. To show by induction on $|T| \geq 1$
that $\delta_T$ can be written as an integer combination of columns of $A$, assume
$|T| \geq 3$ and choose a pair $(i|B)$ with $T = \{i\} \cup B$, $|B| \geq 2$. Then

$$\delta_T = \delta_{\{i\}\cup B} = A_{(i|B)} - \delta_{\{i\}} + \delta_B,$$

where, by the induction hypothesis, the terms $\delta_{\{i\}}$ and $\delta_B$ can be written as
integer combinations of the columns of $A$.

Thus, using elementary column operations, $A$ can be transformed such
that it contains all $m$ elementary column vectors $\delta_T$. Using additional column
operations, all other columns can be zeroed out. Therefore, using elementary
column operations, $A$ can be transformed to the form $[I \ 0]$. □
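The induction in the proof translates directly into a recursion. The sketch below, for an assumed four-element $N$, computes integer coefficients expressing each $\delta_T$ through columns of $A$ and verifies the result. The names `column`, `combination` and `apply_combination` are ours and serve only this illustration.

```python
from itertools import combinations

N = ("a", "b", "c", "d")
ROWS = [frozenset(c) for r in range(1, len(N) + 1) for c in combinations(N, r)]

def column(i, B):
    """Column A_(i|B) of the matrix from (16), as a dict row -> entry."""
    col = {T: 0 for T in ROWS}
    col[frozenset({i})] += 1
    if B:                        # for empty B both |T| >= 2 terms vanish
        col[B | {i}] += 1
        if len(B) >= 2:          # delta_B only hits rows with |T| >= 2
            col[B] -= 1
    return col

def combination(T):
    """Integer coefficients x with sum_k x[k] * A_k = delta_T (proof of Prop. 12)."""
    items = sorted(T)
    i, B = items[0], frozenset(items[1:])
    coeffs = {(i, B): 1}                   # start from the column A_(i|B)
    if B:
        coeffs[(i, frozenset())] = -1      # subtract delta_{i} = A_(i|0)
        if len(B) >= 2:                    # add delta_B, obtained recursively
            for key, val in combination(B).items():
                coeffs[key] = coeffs.get(key, 0) + val
    return coeffs

def apply_combination(coeffs):
    """Evaluate the integer combination of columns of A."""
    total = {T: 0 for T in ROWS}
    for (i, B), val in coeffs.items():
        for T, entry in column(i, B).items():
            total[T] += val * entry
    return total

ok = all(apply_combination(combination(T)) == {S: int(S == T) for S in ROWS}
         for T in ROWS)
print(ok)
```

Since every unit vector is reached with integer coefficients, the columns of $A$ generate the whole lattice $\mathbb{Z}^m$, which is what the Hermite normal form $[I \ 0]$ expresses.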

Before writing this report, we verified computationally that $A$ is unimodular,
strongly unimodular, strongly k-modular, however not totally unimodular
for $3 \leq |N| \leq 6$ using software written by Matthias Walther available at
https://github.com/xammy/unimodularity-test. This led us to a hypothesis
that $A$ is unimodular for any $|N|$. In §4.1.4 we confirm this hypothesis.

4.1.3 Translation to the framework of characteristic imsets 

We observed in §4.1.1 that Jaakkola et al.'s elementary constraints (1)-(2) are
transformed into u-constraints as (4)-(5). Transforming (4)-(5) into c-constraints
is a simpler task because of the one-to-one correspondence $u \leftrightarrow c$ (see §3.3). We
already know that (4) takes the form of tacit restrictions on c-vectors (10). As
concerns the specific inequality constraints (5), we show below that every such
inequality, for $\emptyset \neq \mathcal{A} \subseteq \mathcal{P}_1(N)$ closed under supersets, is transformed into the
framework of c-vectors as follows:

$$0 \leq \sum_{S \subseteq N} \kappa_{\mathcal{A}}(S) \cdot c(S), \tag{28}$$
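where the coefficients $\kappa_{\mathcal{A}}(-)$ are given by

$$\kappa_{\mathcal{A}}(S) = \sum_{T \in \mathcal{A},\ T \subseteq S} (-1)^{|S \setminus T|} \quad \text{for } S \subseteq N. \tag{29}$$

However, the formula (29) is not suitable to compute the coefficients. It is more
appropriate to introduce them equivalently in terms of the class $\mathcal{A}_{\min} \equiv \mathcal{I}$ of
minimal sets in $\mathcal{A}$. More specifically, let us introduce the class $\mathcal{C}(\mathcal{I})$ of possible
unions of sets from a non-empty class $\mathcal{I} \subseteq \mathcal{P}_1(N)$ of incomparable sets:

$$\mathcal{C}(\mathcal{I}) = \{S \subseteq N;\ \exists\, \emptyset \neq \mathcal{K} \subseteq \mathcal{I} \ \text{such that} \ S = \textstyle\bigcup_{T \in \mathcal{K}} T\}.$$

Then one can compute the coefficients $\kappa_{\mathcal{A}}(-)$ recursively as follows:

$$\kappa_{\mathcal{A}}(S) = 0 \quad \text{if } S \subseteq N,\ S \notin \mathcal{C}(\mathcal{I}),$$

$$\kappa_{\mathcal{A}}(S) = 1 - \sum_{T \in \mathcal{C}(\mathcal{I}),\ T \subset S} \kappa_{\mathcal{A}}(T) \quad \text{for } S \in \mathcal{C}(\mathcal{I}). \tag{30}$$

This implies that $\kappa_{\mathcal{A}}(S) = 1$ for $S \in \mathcal{A}_{\min} = \mathcal{I}$ and that $\kappa_{\mathcal{A}}$ has the more zeros
the smaller $|\mathcal{A}_{\min}|$ is. Therefore, in the framework of characteristic imsets, it
is more convenient to ascribe the (transformed) specific inequality constraints
directly to classes $\emptyset \neq \mathcal{I} \subseteq \mathcal{P}_1(N)$ of incomparable sets.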
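The agreement of the direct formula (29) with the recursion (30) can be checked by brute force in a small case. The sketch below does this for every class of incomparable non-empty sets over an assumed three-element $N$; the function names are hypothetical.

```python
from itertools import combinations

N = ("a", "b", "c")
SUBSETS = [frozenset(c) for r in range(len(N) + 1) for c in combinations(N, r)]
P1 = [S for S in SUBSETS if S]

def up_closure(minimal):
    """The class closed under supersets generated by the given minimal sets."""
    return {S for S in SUBSETS if any(T <= S for T in minimal)}

def kappa_direct(minimal):
    """Formula (29): alternating sum over members of the class contained in S."""
    cls = up_closure(minimal)
    return {S: sum((-1) ** len(S - T) for T in cls if T <= S) for S in SUBSETS}

def kappa_recursive(minimal):
    """Recursion (30) over the unions of sets taken from the minimal sets."""
    mins = list(minimal)
    unions = {frozenset().union(*combo)
              for r in range(1, len(mins) + 1)
              for combo in combinations(mins, r)}
    kappa = {S: 0 for S in SUBSETS}
    for S in sorted(unions, key=len):        # strict subsets come first
        kappa[S] = 1 - sum(kappa[T] for T in unions if T < S)
    return kappa

# All non-empty classes of pairwise incomparable non-empty sets.
antichains = [combo for r in range(1, len(P1) + 1)
              for combo in combinations(P1, r)
              if not any(S < T for S in combo for T in combo)]
agree = all(kappa_direct(m) == kappa_recursive(m) for m in antichains)
print(len(antichains), agree)
```

For instance, the class generated by $\{ab, ac, bc\}$ gives $\kappa(abc) = 1 - 3 = -2$ through the recursion, whereas (29) reaches the same value only after summing over all four members of the class.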

Lemma 13 Let $u$ and $c$ be imsets related by (8)-(9) and $\emptyset \neq \mathcal{A} \subseteq \mathcal{P}_1(N)$ a
class of sets closed under supersets. Then the inequality (5) corresponding to $\mathcal{A}$
has the form (28), where the coefficients $\kappa_{\mathcal{A}}(-)$ are given by (30).



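Proof. The first observation is that the coefficients given by (29) satisfy

$$\kappa_{\mathcal{A}}(S) = 0 \ \text{ for } S \subseteq N,\ S \notin \mathcal{A} \qquad \text{and} \qquad \sum_{S \subseteq N} \kappa_{\mathcal{A}}(S) = \sum_{S \in \mathcal{A}} \kappa_{\mathcal{A}}(S) = 1. \tag{31}$$

To verify it realize that $\mathcal{A} \subseteq \mathcal{P}(N)$ is closed under supersets and write:

$$\begin{aligned}
\sum_{S \in \mathcal{A}} \kappa_{\mathcal{A}}(S) &= \sum_{S \in \mathcal{A}} \ \sum_{T \in \mathcal{A},\ T \subseteq S} (-1)^{|S \setminus T|} = \sum_{T \in \mathcal{A}} \ \sum_{S \in \mathcal{A},\ T \subseteq S} (-1)^{|S \setminus T|} \\
&= \sum_{T \in \mathcal{A}} \ \sum_{S,\ T \subseteq S \subseteq N} (-1)^{|S \setminus T|} = \sum_{T \in \mathcal{A}} \delta_N(T) = 1.
\end{aligned}$$

To see that (5) is transformed into (28) we substitute the inverse formula (11)
into it and use the fact that $\mathcal{A}$ is closed under supersets:

$$\begin{aligned}
1 \geq \sum_{T \in \mathcal{A}} u(T) &= \sum_{T \in \mathcal{A}} \ \sum_{S,\ T \subseteq S \subseteq N} (-1)^{|S \setminus T|} \cdot p(S) \\
&= \sum_{S \in \mathcal{A}} \ \sum_{T \in \mathcal{A},\ T \subseteq S} p(S) \cdot (-1)^{|S \setminus T|} = \sum_{S \in \mathcal{A}} p(S) \cdot \underbrace{\sum_{T \in \mathcal{A},\ T \subseteq S} (-1)^{|S \setminus T|}}_{\kappa_{\mathcal{A}}(S)}.
\end{aligned}$$

Thus, substitute (31) in that inequality and get

$$0 \leq 1 - \sum_{S \in \mathcal{A}} \kappa_{\mathcal{A}}(S) \cdot p(S) = \sum_{S \in \mathcal{A}} \kappa_{\mathcal{A}}(S) - \sum_{S \in \mathcal{A}} \kappa_{\mathcal{A}}(S) \cdot p(S) = \sum_{S \in \mathcal{A}} \kappa_{\mathcal{A}}(S) \cdot \underbrace{[\,1 - p(S)\,]}_{c(S)},$$

which is, owing to (9) and (31), nothing but (28).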

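It remains to show that (29) takes the form (30). An auxiliary fact is

$$\forall\, S \in \mathcal{A} \qquad \sum_{T \subseteq S} \kappa_{\mathcal{A}}(T) = 1. \tag{32}$$

Indeed, to see it, consider the class $\mathcal{A}_S = \{T \subseteq S;\ T \in \mathcal{A}\}$, which is a class
of subsets of $S$, closed under supersets. Moreover, for any $T \in \mathcal{A}_S$, one has
$\kappa_{\mathcal{A}}(T) = \kappa_{\mathcal{A}_S}(T)$, which implies by (31) applied to $\mathcal{A}_S$ and $S$ in place of $N$ that

$$1 = \sum_{T \in \mathcal{A}_S} \kappa_{\mathcal{A}_S}(T) = \sum_{T \in \mathcal{A},\ T \subseteq S} \kappa_{\mathcal{A}}(T) = \sum_{T \subseteq S} \kappa_{\mathcal{A}}(T).$$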

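In the rest of the proof we write $\mathcal{I}$ in place of $\mathcal{A}_{\min}$ and omit the index in $\kappa_{\mathcal{A}}(-)$
and write $\kappa(-)$ only. For every $\mathcal{K} \subseteq \mathcal{I}$ we introduce the class of sets whose only
subsets in $\mathcal{I}$ are elements of $\mathcal{K}$:

$$\mathcal{B}_{\mathcal{K}} = \{S \subseteq N;\ K \subseteq S \ \text{for } K \in \mathcal{K} \ \ \&\ \ L \setminus S \neq \emptyset \ \text{for } L \in \mathcal{I} \setminus \mathcal{K}\}.$$

Of course, it may happen that $\mathcal{B}_{\mathcal{K}}$ is empty for some $\mathcal{K} \subseteq \mathcal{I}$. Nevertheless,
the collection of classes $\mathcal{B}_{\mathcal{K}}$, where $\mathcal{K}$ runs over subsets of $\mathcal{I}$, forms a partition
of $\mathcal{P}(N)$. Moreover, every non-empty class $\mathcal{B}_{\mathcal{K}}$ has the least set (in sense of
inclusion), namely $S_{\mathcal{K}} = \bigcup_{T \in \mathcal{K}} T$. Observe that $\mathcal{K} = \emptyset$ leads to a non-empty
class $\mathcal{B}_{\emptyset} = \mathcal{P}(N) \setminus \mathcal{A}$ with $S_{\emptyset} = \emptyset$. Since $\mathcal{I}$ consists of incomparable sets, every
$S \in \mathcal{I}$ belongs to just one $\mathcal{B}_{\mathcal{K}}$ with $|\mathcal{K}| = 1$, namely $\mathcal{K} = \{S\}$. The class $\mathcal{C}(\mathcal{I})$
defined above (30) then coincides with $\{S_{\mathcal{K}};\ \emptyset \neq \mathcal{K} \subseteq \mathcal{I} \ \text{with} \ \mathcal{B}_{\mathcal{K}} \neq \emptyset\}$.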

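An easy consequence of (29) is that $\kappa(S) = 0$ for $S \in \mathcal{B}_{\emptyset}$ ($\equiv S \notin \mathcal{A}$) and
$\kappa(S) = 1$ for $S \in \mathcal{I}$. To verify (30) it is enough to show by induction on $|\mathcal{K}|$ the
following two statements:

(i) $\kappa(S) = 0$ for $S \in \mathcal{B}_{\mathcal{K}}$, $S \neq S_{\mathcal{K}}$,

(ii) $\forall\, |\mathcal{K}| \geq 1$ with $\mathcal{B}_{\mathcal{K}} \neq \emptyset$: $\ 1 = \sum_{\mathcal{L} \subseteq \mathcal{K},\ \mathcal{B}_{\mathcal{L}} \neq \emptyset} \kappa(S_{\mathcal{L}})$.

Indeed, this is because for $\mathcal{L}, \mathcal{K} \subseteq \mathcal{I}$ with $\mathcal{B}_{\mathcal{L}} \neq \emptyset \neq \mathcal{B}_{\mathcal{K}}$ one has $\mathcal{L} \subseteq \mathcal{K}$ if and
only if $S_{\mathcal{L}} \subseteq S_{\mathcal{K}}$. We already know this is true in case $|\mathcal{K}| = 0$. Now assume
$|\mathcal{K}| \geq 1$ and the statements hold for any $\mathcal{L} \subset \mathcal{K}$. Consider arbitrary $S \in \mathcal{B}_{\mathcal{K}}$ and
write using (32) and the fact that subsets of $S$ must belong to $\mathcal{B}_{\mathcal{L}}$ for $\mathcal{L} \subseteq \mathcal{K}$:

$$\begin{aligned}
1 = \sum_{T \subseteq S} \kappa(T) &= \sum_{\mathcal{L} \subseteq \mathcal{K},\ \mathcal{B}_{\mathcal{L}} \neq \emptyset} \ \sum_{T \subseteq S,\ T \in \mathcal{B}_{\mathcal{L}}} \kappa(T) \\
&= \sum_{T \subseteq S,\ T \in \mathcal{B}_{\mathcal{K}}} \kappa(T) + \sum_{\mathcal{L} \subset \mathcal{K},\ \mathcal{B}_{\mathcal{L}} \neq \emptyset} \ \sum_{T \subseteq S,\ T \in \mathcal{B}_{\mathcal{L}}} \kappa(T).
\end{aligned} \tag{33}$$

Now, observe that the induction premise (i) applied to any $\mathcal{L} \subset \mathcal{K}$, $\mathcal{B}_{\mathcal{L}} \neq \emptyset$
says that $\kappa$ vanishes in $\mathcal{B}_{\mathcal{L}}$ except for $S_{\mathcal{L}}$. In particular, for any $\mathcal{D} \subseteq \mathcal{B}_{\mathcal{L}}$ with
$S_{\mathcal{L}} \in \mathcal{D}$ one has $\sum_{T \in \mathcal{D}} \kappa(T) = \kappa(S_{\mathcal{L}})$. This implies that the second term in (33)
is $\sum_{\mathcal{L} \subset \mathcal{K},\ \mathcal{B}_{\mathcal{L}} \neq \emptyset} \kappa(S_{\mathcal{L}})$ and we have observed that

$$\sum_{T \subseteq S,\ T \in \mathcal{B}_{\mathcal{K}}} \kappa(T) = 1 - \sum_{\mathcal{L} \subset \mathcal{K},\ \mathcal{B}_{\mathcal{L}} \neq \emptyset} \kappa(S_{\mathcal{L}}), \tag{34}$$

which means the function $S \mapsto \sum_{T \subseteq S,\ T \in \mathcal{B}_{\mathcal{K}}} \kappa(T)$ is constant on $\mathcal{B}_{\mathcal{K}}$. This allows
one to derive (i) for $\mathcal{K}$, for instance, by induction on $|S|$ for $S \in \mathcal{B}_{\mathcal{K}}$. If we apply
(34) to $S = S_{\mathcal{K}}$ we get (ii) for $\mathcal{K}$. □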

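Example 14 Take $N = \{a, b, c\}$ and classify types of considered classes $\mathcal{A}$,
specified by $\mathcal{A}_{\min}$. Using (30) we get the corresponding inequalities:

• $\mathcal{A}_{\min} = \{abc\}$ leads to $\kappa_{\mathcal{A}}(abc) = 1$ and $\kappa_{\mathcal{A}}(S) = 0$ otherwise. This gives
the constraint $0 \leq c(abc)$,

• $\mathcal{A}_{\min} = \{ab\}$ leads to $\kappa_{\mathcal{A}}(ab) = 1$ (and $\kappa_{\mathcal{A}}(S) = 0$ otherwise), which gives
the constraint $0 \leq c(ab)$,

• $\mathcal{A}_{\min} = \{ab, ac\}$ leads to $\kappa_{\mathcal{A}}(ab) = \kappa_{\mathcal{A}}(ac) = 1$ and $\kappa_{\mathcal{A}}(abc) = -1$, which
gives the constraint $0 \leq c(ab) + c(ac) - c(abc)$,

• $\mathcal{A}_{\min} = \{ab, ac, bc\}$ leads to $\kappa_{\mathcal{A}}(ab) = \kappa_{\mathcal{A}}(ac) = \kappa_{\mathcal{A}}(bc) = 1$ and $\kappa_{\mathcal{A}}(abc) =
-2$, which gives the constraint $0 \leq c(ab) + c(ac) + c(bc) - 2c(abc)$,

• $\mathcal{A}_{\min} = \{c\}$ leads to $\kappa_{\mathcal{A}}(c) = 1$ which gives $0 \leq c(c)$, which is a vacuous
constraint because of $c(c) = 1$ implied by (10),

• $\mathcal{A}_{\min} = \{c, ab\}$ leads to $\kappa_{\mathcal{A}}(c) = \kappa_{\mathcal{A}}(ab) = 1$ and $\kappa_{\mathcal{A}}(abc) = -1$, and then
to $0 \leq c(c) + c(ab) - c(abc)$, which leads after the substitution $c(c) = 1$ to
$0 \leq 1 + c(ab) - c(abc)$,

• $\mathcal{A}_{\min} = \{a, b\}$ leads to $\kappa_{\mathcal{A}}(a) = \kappa_{\mathcal{A}}(b) = 1$ and $\kappa_{\mathcal{A}}(ab) = -1$, and then,
after substituting $c(i) = 1$, to $0 \leq 2 - c(ab)$,

• $\mathcal{A}_{\min} = \{a, b, c\}$ leads to $\kappa_{\mathcal{A}}(a) = \kappa_{\mathcal{A}}(b) = \kappa_{\mathcal{A}}(c) = 1$, $\kappa_{\mathcal{A}}(ab) = \kappa_{\mathcal{A}}(ac) =
\kappa_{\mathcal{A}}(bc) = -1$ and $\kappa_{\mathcal{A}}(abc) = 1$, which gives, after the substitution $c(i) = 1$,
$0 \leq 3 - c(ab) - c(ac) - c(bc) + c(abc)$.

Thus, we see that the non-vacuous constraints are identical with the transformed
elementary $\eta$-constraints - see Example [5]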

4.1.4 Remarks on the characteristic transformation

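Let us consider the characteristic transformation given by (13) - see §3.3.2. It
can be viewed as a mapping $\eta \mapsto B\eta$, where $B$ is an $m \times n$ matrix, whose
entries $b[S,(i|B)]$ are specified as follows: for $|S| \geq 1$, $i \in N$, $B \subseteq N \setminus \{i\}$,

$$b[S,(i|B)] = \delta(\, i \in S \ \&\ S \setminus \{i\} \subseteq B \,) = \delta(S \subseteq \{i\} \cup B) - \delta(S \subseteq B). \tag{35}$$

There is a close relation to the matrix $A$ introduced in (16). Indeed, there exists
an invertible unimodular $m \times m$ matrix $C$ such that $B = CA$. More specifically,
the entries $c[S,T]$ of $C$ for non-empty sets $S, T \subseteq N$ are given by

$$c[S,T] = \begin{cases} \delta(S \subseteq T) & \text{if } |S| \geq 2, \\ \delta(S = T) & \text{if } |S| = 1. \end{cases}$$

To see it write for fixed $S \subseteq N$, $|S| \geq 2$ and a pair $(i|B)$ with help of (16):

$$\begin{aligned}
\sum_{T \neq \emptyset} c[S,T] \cdot a[T,(i|B)] &= \sum_{T \supseteq S} a[T,(i|B)] = \sum_{T \supseteq S} [\, \delta_{\{i\}\cup B}(T) - \delta_B(T) \,] \\
&= \sum_{T \supseteq S} \delta_{\{i\}\cup B}(T) - \sum_{T \supseteq S} \delta_B(T) \\
&= \delta(S \subseteq \{i\} \cup B) - \delta(S \subseteq B) = b[S,(i|B)].
\end{aligned}$$

Analogously, for $S \subseteq N$, $|S| = 1$ one has

$$\sum_{T \neq \emptyset} c[S,T] \cdot a[T,(i|B)] = \sum_{T = S} a[T,(i|B)] = a[S,(i|B)] = \delta_{\{i\}}(S) = b[S,(i|B)].$$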
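The relation $B = CA$ can be confirmed entry by entry for a small $N$. The sketch below does so for three variables and, in addition, checks a candidate inverse of $C$ with entries $\delta(T \subseteq R) \cdot (-1)^{|R \setminus T|}$ for $|T| \geq 2$ and $\delta(T = R)$ for $|T| = 1$. All function names are our own shorthand for the entries of the respective matrices.

```python
from itertools import combinations

N = ("a", "b", "c")
ROWS = [frozenset(c) for r in range(1, len(N) + 1) for c in combinations(N, r)]
COLS = [(i, frozenset(c)) for i in N for r in range(len(N))
        for c in combinations([j for j in N if j != i], r)]

def a(T, i, B):
    """Entry a[T, (i|B)] of A, formula (16)."""
    if len(T) >= 2:
        return int(T == B | {i}) - int(T == B)
    return int(T == frozenset({i}))

def b(S, i, B):
    """Entry b[S, (i|B)] of B, formula (35)."""
    return int(S <= B | {i}) - int(S <= B)

def c(S, T):
    """Entry c[S, T] of the transition matrix C."""
    return int(S <= T) if len(S) >= 2 else int(S == T)

def d(T, R):
    """Entry d[T, R] of the candidate inverse of C."""
    if len(T) >= 2:
        return int(T <= R) * (-1) ** len(R - T)
    return int(T == R)

product_CA = all(sum(c(S, T) * a(T, i, B) for T in ROWS) == b(S, i, B)
                 for S in ROWS for (i, B) in COLS)
product_CD = all(sum(c(S, T) * d(T, R) for T in ROWS) == int(S == R)
                 for S in ROWS for R in ROWS)
print(product_CA, product_CD)
```

Both checks reduce to the Möbius-type identity $\sum_{S \subseteq T \subseteq R} (-1)^{|R \setminus T|} = \delta(S = R)$.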

We leave it to the reader to verify that the $m \times m$-matrix $D$ with entries $d[T,R]$
for non-empty $T, R \subseteq N$ given by

$$d[T,R] = \begin{cases} \delta(T \subseteq R) \cdot (-1)^{|R \setminus T|} & \text{if } |T| \geq 2, \\ \delta(T = R) & \text{if } |T| = 1, \end{cases}$$

is an inverse matrix to $C$. Since both $C$ and its inverse $D$ are integral matrices,
they are both unimodular. The following observation appears to be important.

Lemma 15 Both the matrix $A$ given by (16) and the matrix $B$ given by (35)
are full row rank unimodular matrices.

Proof. Since $A = DB$ where $D$ is an invertible unimodular $m \times m$-matrix, it
is enough to show that $B$ is unimodular. By Proposition 12 and $B = CA$ we
already know that $B$ has full row rank.

To show it is unimodular we re-label its columns and add some new ones.
The original columns of $B$ corresponding to pairs $(i|B)$ with $B \neq \emptyset$ are
re-labelled by pairs $(C : B)$ of sets $\emptyset \neq B \subset C \subseteq N$ with $|C \setminus B| = 1$; that is, $(i|B)$
is replaced by $(C : B)$ where $C = \{i\} \cup B$. The formula (35) implies

$$b[S,(C:B)] = \delta(S \subseteq C) - \delta(S \subseteq B) \quad \text{for } S \subseteq N,\ |S| \geq 1.$$

The original column corresponding to a pair $(i|\emptyset)$, $i \in N$ is re-labelled by a
singleton set $R = \{i\}$. Note that the column has the form $\delta_R$. The newly added
columns are labelled by sets $R \subseteq N$, $|R| \geq 2$ and defined as follows:

$$b[S,R] := \delta(S \subseteq R) \quad \text{for } S \subseteq N,\ |S| \geq 1.$$

Observe that this formula also holds in case $|R| = 1$. Now, it is enough to show
that the extended matrix $\bar{B}$ is unimodular.

Let $\tilde{B}$ denote the $m \times m$-submatrix of $\bar{B}$ corresponding to columns labelled
by sets $\emptyset \neq R \subseteq N$. It follows from the above description of columns in $\bar{B}$ that
$\bar{B} = \tilde{B}E$ where the matrix $E$ has the entries $e[T,R]$ for $\emptyset \neq T, R \subseteq N$ and
$e[T,(C:B)]$ for $\emptyset \neq T \subseteq N$, $\emptyset \neq B \subset C \subseteq N$, $|C \setminus B| = 1$ specified as follows:

$$e[T,R] = \delta(T = R),$$

$$e[T,(C:B)] = \delta(T = C) - \delta(T = B).$$

Therefore, it is enough to show that $\tilde{B}$ is an invertible unimodular matrix and $E$
is totally unimodular (cf. Theorem 21.6 in [9]). We leave it to the reader to verify
that the inverse matrix $F$ to $\tilde{B}$ has the entries

$$f[R,U] = \delta(R \subseteq U) \cdot (-1)^{|U \setminus R|} \quad \text{for } \emptyset \neq R, U \subseteq N.$$

Since $\tilde{B}$ has integral inverse $F$, it is unimodular. The matrix $E$ is totally
unimodular because it is the restriction of a network matrix (cf. §19.3 of [9]).
More specifically, one can add one dummy row to $E$, labelled by $S = \emptyset$: put
$e[\emptyset,R] = -1$ for $\emptyset \neq R \subseteq N$ and $e[\emptyset,(C:B)] = 0$ for any pair $(C : B)$.
We obtain a matrix with entries in $\{-1, 0, +1\}$ such that each of its columns
contains exactly once $+1$ and exactly once $-1$. As mentioned in the statement
(18) of §19.3 in [9], such a matrix is totally unimodular. Of course, it remains
totally unimodular if the row corresponding to $S = \emptyset$ is again removed. □
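For three variables the claim of Lemma 15 can be confirmed by brute force. The sketch below assumes the definition of unimodularity for a full row rank integral matrix used in [9] (every non-singular $m \times m$ submatrix has determinant $\pm 1$) and evaluates all $\binom{12}{7} = 792$ maximal minors of $B$ with exact rational arithmetic. The helper `det` is a plain Gaussian elimination written for this illustration.

```python
from fractions import Fraction
from itertools import combinations

N = ("a", "b", "c")
ROWS = [frozenset(c) for r in range(1, len(N) + 1) for c in combinations(N, r)]
COLS = [(i, frozenset(c)) for i in N for r in range(len(N))
        for c in combinations([j for j in N if j != i], r)]

def b(S, i, B):
    """Entry b[S, (i|B)] of the matrix B, formula (35)."""
    return int(S <= B | {i}) - int(S <= B)

MATRIX = [[b(S, i, B) for (i, B) in COLS] for S in ROWS]

def det(M):
    """Exact determinant by Gaussian elimination over the rationals."""
    M = [[Fraction(x) for x in row] for row in M]
    size, result = len(M), Fraction(1)
    for k in range(size):
        pivot = next((r for r in range(k, size) if M[r][k] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != k:
            M[k], M[pivot] = M[pivot], M[k]
            result = -result
        result *= M[k][k]
        for r in range(k + 1, size):
            factor = M[r][k] / M[k][k]
            for j in range(k, size):
                M[r][j] -= factor * M[k][j]
    return result

# Determinants of all square submatrices built from 7 of the 12 columns.
minors = {det([[row[j] for j in chosen] for row in MATRIX])
          for chosen in combinations(range(len(COLS)), len(ROWS))}
print(minors <= {-1, 0, 1}, any(v != 0 for v in minors))
```

The first printed value reports that no maximal minor leaves $\{-1, 0, +1\}$ and the second that at least one is non-zero, that is, $B$ has full row rank. The same loop applied to $A$ differs only by the factor $\det C = \pm 1$.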



4.2 Transformation of cluster inequalities 

Luckily, these inequalities transform nicely to the framework of imsets.

Lemma 16 Provided $\eta$ satisfies (2), the cluster inequality (3) for $C \subseteq N$,
$|C| \geq 2$ can be re-written either in terms of u-vectors as

$$\sum_{T \subseteq N,\ |C \cap T| \geq 2} u(T) \cdot (|C \cap T| - 1) \geq 0, \tag{36}$$

or in terms of c-vectors as

$$|C| - 1 - \sum_{S \subseteq C,\ |S| \geq 2} c(S) \cdot (-1)^{|S|} \geq 0. \tag{37}$$
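Proof. By (3), it is enough to show that the following equalities hold

$$|C| - \sum_{S \subseteq C,\ |S| \geq 2} c(S) \cdot (-1)^{|S|} = 1 + \sum_{T \subseteq N,\ |C \cap T| \geq 2} u(T) \cdot (|C \cap T| - 1) = \sum_{i \in C} \ \sum_{B \subseteq N \setminus \{i\},\ B \cap C = \emptyset} \eta(i|B).$$

Let $(*)$ denote the first expression there and write by (8)-(9):

$$\begin{aligned}
(*) &= |C| - \sum_{S \subseteq C,\ |S| \geq 2} (-1)^{|S|} \cdot \Big[\, 1 - \sum_{T \supseteq S} u(T) \,\Big] \\
&= |C| - \sum_{S \subseteq C,\ |S| \geq 2} (-1)^{|S|} + \sum_{S \subseteq C,\ |S| \geq 2} (-1)^{|S|} \cdot \sum_{T \supseteq S} u(T) \\
&= 1 + \sum_{S \subseteq C,\ |S| \geq 2} \ \sum_{T \supseteq S} u(T) \cdot (-1)^{|S|} = 1 + \sum_{T,\ |C \cap T| \geq 2} u(T) \cdot \underbrace{\sum_{S \subseteq C \cap T,\ |S| \geq 2} (-1)^{|S|}}_{|C \cap T| - 1}.
\end{aligned}$$

This already proves the first equality. Now, we substitute (7) in the last
expression (note $|T| \geq 2$ for $T$ here) and change the order of summation:

$$\begin{aligned}
(*) &= 1 + \sum_{T,\ |C \cap T| \geq 2} u(T) \cdot (|C \cap T| - 1) \\
&= 1 + \sum_{T,\ |C \cap T| \geq 2} \delta_N(T) \cdot (|C \cap T| - 1) + \sum_{i \in N} \sum_{B \subseteq N \setminus \{i\}} \eta(i|B) \cdot \Big\{ \sum_{T,\ |C \cap T| \geq 2} \delta_B(T) \cdot (|C \cap T| - 1) - \sum_{T,\ |C \cap T| \geq 2} \delta_{\{i\}\cup B}(T) \cdot (|C \cap T| - 1) \Big\} \\
&= |C| + \sum_{i \in N} \sum_{B \subseteq N \setminus \{i\}} \eta(i|B) \cdot \big\{ \delta(|C \cap B| \geq 2) \cdot (|C \cap B| - 1) - \delta(|C \cap (\{i\} \cup B)| \geq 2) \cdot (|C \cap (\{i\} \cup B)| - 1) \big\}.
\end{aligned}$$

Now, we realize that the inner expression in braces vanishes for $i \notin C$
because then $C \cap B = C \cap (\{i\} \cup B)$. Analogously, it vanishes if $i \in C$ but
$C \cap B = \emptyset$. However, in case $i \in C$ and $C \cap B \neq \emptyset$ it equals $-1$. Thus, we
write using (2) for $i \in C$:

$$\begin{aligned}
(*) &= |C| + \sum_{i \in N} \sum_{B \subseteq N \setminus \{i\}} \eta(i|B) \cdot \delta(\, i \in C \ \&\ C \cap B \neq \emptyset \,) \cdot (-1) \\
&= |C| - \sum_{i \in C} \sum_{B \subseteq N \setminus \{i\}} \eta(i|B) \cdot \delta(C \cap B \neq \emptyset) \\
&= \sum_{i \in C} \sum_{B \subseteq N \setminus \{i\}} \eta(i|B) - \sum_{i \in C} \ \sum_{B \subseteq N \setminus \{i\},\ B \cap C \neq \emptyset} \eta(i|B) \\
&= \sum_{i \in C} \ \sum_{B \subseteq N \setminus \{i\},\ B \cap C = \emptyset} \eta(i|B),
\end{aligned}$$

which gives the remaining required equality. □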
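The chain of equalities in the proof can be tested numerically on all 0/1 vectors $\eta$ satisfying (2) over three variables, that is, on all assignments of one parent set to each variable, cyclic ones included. The sketch below computes the three expressions via (7) and (8)-(9) and also counts the assignments for which every cluster inequality (3) holds. The function names are illustrative.

```python
from itertools import combinations, product

N = ("a", "b", "c")

def subsets(items, min_size=0):
    """Subsets of the given items with at least min_size elements."""
    return [frozenset(c) for r in range(min_size, len(items) + 1)
            for c in combinations(items, r)]

def u_of(parents):
    """u_eta(T) for |T| >= 2 via (7), with eta(i|B) = 1 iff B = parents[i]."""
    u = {T: int(T == frozenset(N)) for T in subsets(N, 2)}
    for i, pa in parents.items():
        if pa:                      # then {i} u pa has at least two elements
            u[pa | {i}] -= 1
        if len(pa) >= 2:
            u[pa] += 1
    return u

def three_sides(parents, C):
    """The three expressions compared in the proof of Lemma 16."""
    u = u_of(parents)
    c = {S: 1 - sum(u[T] for T in u if S <= T) for S in subsets(N, 2)}
    lhs = len(C) - sum(c[S] * (-1) ** len(S) for S in subsets(sorted(C), 2))
    mid = 1 + sum(u[T] * (len(C & T) - 1) for T in u if len(C & T) >= 2)
    rhs = sum(1 for i in C if not (parents[i] & C))
    return lhs, mid, rhs

choices = [subsets([j for j in N if j != i]) for i in N]
codes = [dict(zip(N, pa)) for pa in product(*choices)]
clusters = subsets(N, 2)
identity_ok = all(len(set(three_sides(g, C))) == 1
                  for g in codes for C in clusters)
feasible = [g for g in codes
            if all(three_sides(g, C)[2] >= 1 for C in clusters)]
print(len(codes), identity_ok, len(feasible))
```

Of the 64 assignments exactly 25 pass all cluster inequalities, which is the number of acyclic directed graphs over three labelled nodes.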

Example 17 Take $N = \{a, b, c\}$. By (36), there are four transformed cluster
inequalities for $|C| \geq 2$ breaking into two types:

• $u(\{a, b\}) + u(\{a, b, c\}) \geq 0$, (for $C = \{a, b\}$)

• $u(\{a, b\}) + u(\{a, c\}) + u(\{b, c\}) + 2 \cdot u(\{a, b, c\}) \geq 0$. (for $C = \{a, b, c\}$)

We observe they coincide with two types of non-specific inequality constraints
mentioned in Example [S] (see §3.2.1). Nevertheless, the remaining non-specific
constraint mentioned there, namely $u(\{a, b, c\}) \geq 0$, is not implied by the
transformed cluster inequalities. For instance, the u-vector given by $u(T) = (-1)^{|T|}$
for $T \subseteq \{a, b, c\}$ shows that.
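The counterexample is easy to verify by direct evaluation. The following short sketch computes the left-hand sides of (36) for all four clusters and the value $u(\{a,b,c\})$; the variable names are ours.

```python
from itertools import combinations

N = ("a", "b", "c")
SUBSETS = [frozenset(c) for r in range(len(N) + 1) for c in combinations(N, r)]

# The u-vector from the example: u(T) = (-1)^{|T|}.
u = {T: (-1) ** len(T) for T in SUBSETS}

def cluster_lhs(C):
    """Left-hand side of the transformed cluster inequality (36) for cluster C."""
    return sum(u[T] * (len(C & T) - 1) for T in SUBSETS if len(C & T) >= 2)

clusters = [C for C in SUBSETS if len(C) >= 2]
values = [cluster_lhs(C) for C in clusters]
print(values, u[frozenset(N)])
```

All four cluster values are non-negative (three zeros for the two-element clusters and the value 1 for $C = N$), while $u(\{a,b,c\}) = -1$ violates the remaining non-specific constraint.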

The above example suggests that the transformed cluster inequalities are 
implied by the non-specific ones, which is indeed the case. 

Corollary 18 The cluster inequalities transformed to the framework of u-vectors
(36) follow from non-specific inequality constraints (6).

Proof. By (36), the cluster inequality for $C \subseteq N$, $|C| \geq 2$ has the form

$$\langle m_C, u \rangle \equiv \sum_{T \subseteq N} m_C(T) \cdot u(T) \geq 0, \quad \text{with } m_C(T) = \max\,\{0, |C \cap T| - 1\} \ \text{ for } T \subseteq N.$$

The function $m_C$ is a special (standardized extreme) supermodular function,
and, therefore, the inequality for $C$ follows from (6). □

Thus, we can summarize. The exact translation of the equality constraints
(2) to the framework of u-vectors are the equality constraints (4) - see §3.2.2.
Provided (2) is valid, the exact translation of non-negativity constraints (1) are
specific inequality constraints (5) (see Corollary 11 in §4.1.1), and by Corollary
18 the cluster inequalities (3) translate to some of the non-specific inequality
constraints (6). In particular, we have

Corollary 19 The u-polyhedron specified by (4)-(6) is contained in the image
of the $\eta$-polyhedron specified by (1)-(3) by the mapping $\eta \mapsto u_\eta$ defined in (7),
which is the polyhedron specified by (4), (5) and (36).

5 LP relaxation 

To motivate the next result consider the case of three variables and transform
Jaakkola et al.'s polyhedron $J$ (§3.1.1) to the framework of c-vectors.

Example 20 In Example [3] (see §4.1), we transformed the elementary
constraints (1)-(2) to the framework of characteristic imsets in case $N = \{a, b, c\}$.
The result was a polyhedron given by fifteen inequalities and four equality
constraints. One can add the transformed cluster inequalities (37) to those
constraints. There are four such inequalities breaking into two types:

• $0 \leq 1 - c(ab)$, (for $C = \{a, b\}$)

• $0 \leq 2 - c(ab) - c(ac) - c(bc) + c(abc)$. (for $C = \{a, b, c\}$)

We computed (again by Polymake [3]) the vertices of the resulting polyhedron
(= the image of $J$) and got 12 vertices. The type representatives are as follows:

$[0,0,0,0]$, $[1,0,0,0]$, $[1,1,0,0]$, $[1,1,0,1]$, $[1,1,1,1]$, $[1,1,1,\frac{3}{2}]$.

All the eleven lattice points here are characteristic imsets (for acyclic directed
graphs), while the fractional vertex $[1,1,1,\frac{3}{2}]$ is not. However, it is a convex
combination of vertices of the bigger polyhedron (= of the image of J'), namely 
of [2, 2, 2, 3] and [0, 0, 0, 0] - see Example il 

To get the exact polyhedral characterization of the characteristic imset
polytope (= of the convex hull of the set of characteristic imsets) in this case
$N = \{a, b, c\}$ one has to add the translation of the non-specific inequality
constraint $u(\{a, b, c\}) \geq 0$ - see Example 17. By (8)-(9), it leads to

• $c(abc) \leq 1$,

which clearly cuts off the fractional vertex and the result is just the polytope
spanned by the remaining eleven lattice points. Thus, the example shows that
the basic inequalities for characteristic imsets mentioned in §3.3.1 (Corollary 6)
are not implied by the transformed Jaakkola et al.'s inequalities (1)-(3).

Nevertheless, we have observed that in case $|N| = 3$ the only lattice points
within the transformed polyhedron are the characteristic imsets. This leads to a
hypothesis that this holds for any $|N|$. We confirm this conjecture now using the
observation from Lemma 15. Thus, by transforming Jaakkola et al.'s polyhedron
$J$ we get an explicit LP relaxation of the characteristic imset polytope.
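The role of the fractional vertex can be reproduced with exact arithmetic. The sketch below assumes the coordinate order $[c(ab), c(ac), c(bc), c(abc)]$ for the vertices listed in Example 20 and checks the point $[1,1,1,\frac{3}{2}]$ against all transformed specific inequalities (28), with coefficients from (29), and all transformed cluster inequalities (37). The helper names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

N = ("a", "b", "c")
SUBSETS = [frozenset(c) for r in range(len(N) + 1) for c in combinations(N, r)]
P1 = [S for S in SUBSETS if S]

def c_vector(ab, ac, bc, abc):
    """Full c-vector with the tacit values c(S) = 1 for |S| <= 1."""
    c = {S: Fraction(1) for S in SUBSETS if len(S) <= 1}
    c.update({frozenset("ab"): Fraction(ab), frozenset("ac"): Fraction(ac),
              frozenset("bc"): Fraction(bc), frozenset("abc"): Fraction(abc)})
    return c

def specific_ok(c):
    """Inequalities (28) for every non-empty class closed under supersets."""
    for mask in range(1, 2 ** len(P1)):
        cls = {P1[k] for k in range(len(P1)) if mask >> k & 1}
        if not all(S in cls for T in cls for S in P1 if T <= S):
            continue
        kappa = {S: sum((-1) ** len(S - T) for T in cls if T <= S) for S in P1}
        if sum(kappa[S] * c[S] for S in P1) < 0:
            return False
    return True

def cluster_ok(c):
    """Inequalities (37) for every cluster C with |C| >= 2."""
    return all(len(C) - 1 - sum(c[S] * (-1) ** len(S)
                                for S in P1 if S <= C and len(S) >= 2) >= 0
               for C in P1 if len(C) >= 2)

v = c_vector(1, 1, 1, Fraction(3, 2))
print(specific_ok(v), cluster_ok(v), v[frozenset("abc")] <= 1)
```

The point passes both families of inequalities but fails $c(abc) \leq 1$, in line with the discussion above; it is also the midpoint of $[2,2,2,3]$ and $[0,0,0,0]$.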

Corollary 21 The only lattice points within the polyhedron of c-vectors given
by (10), (28) and (37) are characteristic imsets (for acyclic directed graphs).

Proof. Let us interpret any c-vector as an element of $\mathbb{R}^{\mathcal{P}_1(N)} = \mathbb{R}^m$, that is,
$c(\emptyset) = 1$ by a convention. We have already observed that the polyhedron given
by (10), (28) and (37) is the image of the polyhedron $J$ specified by (1)-(3) by
the transformation $\eta \mapsto B\eta = c$ defined in (13) - see §4.1.3 and §4.2.

Assume $c$ is a lattice point in the considered polyhedron. Thus, $c$ has a
pre-image $x \in J$, which implies that the polyhedron $\{x \geq 0;\ Bx = c\}$ in $\mathbb{R}^n$ is
non-empty. By Lemma 15, $B$ is unimodular, which allows us to use Theorem
19.2 in [9] saying that a full row rank $m \times n$-matrix $B$ is unimodular if and only
if the polyhedron $\{x \geq 0;\ Bx = c\}$ is integral for any $c \in \mathbb{Z}^m$. That means, it
is the convex hull of its lattice points. In particular, since it is non-empty, it has
at least one lattice point. Let us fix one such lattice point $\eta \in \mathbb{Z}^n$, $\eta \geq 0$ with
$B\eta = c$. It automatically satisfies (1); (2) holds because $c(S) = 1$ for $|S| = 1$ and
$B\eta = c$. As (37) holds for $c$, $\eta$ satisfies all cluster inequalities (3) (by Lemma
16). That means, $\eta$ is a lattice point in $J$.

By Lemma SI $\eta$ is necessarily the code $\eta_G$ of an acyclic directed graph $G$
over $N$. By Lemma 7 its image $c$ by the characteristic transformation is the
characteristic imset $c_G$ corresponding to $G$. □

Remark In the proof of Corollary 21 we have shown that if $c$ is a lattice
point in the cone generated by columns of $B$ then it is a non-negative integer
combination of columns of B. That means, in terms of § 16.4 of [9], the columns 
of B form the minimal integral Hilbert basis of the cone generated by them. 
Following the terminology from commutative algebra, the semigroup generated 
by columns of B is normal [SI [151 US] . 

Nevertheless, because of the one-to-one correspondence between u-vectors
and c-vectors, we have an analogous result in the framework of standard imsets.

Corollary 22 The polyhedron of u-vectors given by (4), (5) and (36) is an LP
relaxation of the standard imset polytope.

Proof. As explained in §3.3 the mapping $u \mapsto c$ given by (8)-(9) is invertible
and maps lattice points to lattice points. Moreover, (4) is transformed to (10),
(5) to (28) by Lemma 13 and (36) to (37) by Lemma 16. Thus, the image of the
polyhedron is the polyhedron of c-vectors from Corollary 21. The pre-images of
characteristic imsets are standard imsets. □

Note that one can also prove Corollary 22 directly, by the method by which
Corollary 21 was proved. Indeed, one can use an analogous consideration where the
matrix $B$ is replaced by $A$ and the vector $c$ by $b_u$ for a u-vector - see the relation (17)
mentioned in §4.1.1.

Thus, we have an explicit LP relaxation of the standard imset polytope and
the conjecture from [14] is confirmed:

Corollary 23 The polyhedron of u-vectors given by (4)-(6) is an LP relaxation
of the standard imset polytope.

Proof. This follows from Corollaries 19 and 22. □

Conclusions

Corollary 21 gives an explicit LP relaxation of the characteristic imset polytope.
Nevertheless, some of the inequalities (28) are superfluous because they follow
from the remaining inequalities. Moreover, adding the basic inequalities
from Corollary 6 may allow a further reduction of the number of inequalities.

Another research direction is to look for an even looser LP relaxation of
the standard/characteristic imset polytope which, however, has a smaller number
of inequalities.